Bulgaria Varna
Section Computer science
Title Creating groups for marketing purposes from website usage data
Author(-s) Sulova S.D.
Affiliations University of Economics - Varna
Abstract Grouping customers and extracting knowledge about those groups are important to online businesses because they allow marketing techniques to be applied purposefully. Individual customers can then be served according to the identified interests and preferences of the group they belong to. In this article, we propose a way to identify and create user groups by processing website usage data. We take the visit records for a selected website from the server log files and then retrieve and process the text content of the visited web pages. The approach is based on natural language processing and uses methods for clustering text documents. The method is tested experimentally with the software product RapidMiner on data from visits to a Bulgarian e-shop.
Keywords text clustering, group, text mining, Logfile, RapidMiner
UDC 519.688
MSC 68P20, 68T50
DOI 10.20537/vm170314
Received 1 August 2017
Language English
Citation Sulova S.D. Creating groups for marketing purposes from website usage data, Vestnik Udmurtskogo Universiteta. Matematika. Mekhanika. Komp'yuternye Nauki, 2017, vol. 27, issue 3, pp. 470-478.
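
The pipeline outlined in the abstract (server log, text of the visited pages, text clustering) can be illustrated with a small sketch. It is not the RapidMiner process used in the article: it is a plain-Python stand-in under several assumptions, namely that the log is in Common Log Format, that a visitor is identified by client address, that page text is already available in a dictionary, and that TF-IDF weighting with cosine similarity and k-means is an acceptable choice of representation and algorithm. All names (LOG, PAGES, group_users and the helper functions) and all sample data are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed Common Log Format; only successful GET requests are kept.
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" 200 ')

def visits_per_user(log_lines):
    """Map each client address to the list of paths it fetched."""
    visits = defaultdict(list)
    for line in log_lines:
        match = LOG_LINE.match(line)
        if match:
            visits[match.group(1)].append(match.group(2))
    return dict(visits)

def tfidf(docs):
    """Turn token lists into sparse TF-IDF vectors (dict: term -> weight)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [{t: c * math.log(n / df[t]) for t, c in Counter(doc).items()}
            for doc in docs]

def cosine(a, b):
    """Cosine similarity of two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(vectors, k, iterations=20):
    """k-means on sparse vectors; the first k vectors seed the centroids."""
    centroids = [dict(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign every vector to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c]))
                  for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                total = Counter()
                for v in members:
                    total.update(v)
                centroids[c] = {t: w / len(members) for t, w in total.items()}
    return labels

def group_users(log_lines, page_text, k):
    """Cluster visitors by the text of the pages they viewed."""
    visits = visits_per_user(log_lines)
    users = list(visits)
    docs = [" ".join(page_text.get(p, "") for p in visits[u]).lower().split()
            for u in users]
    groups = defaultdict(list)
    for user, label in zip(users, kmeans(tfidf(docs), k)):
        groups[label].append(user)
    return dict(groups)

# Hypothetical sample data, invented for illustration only.
PAGES = {
    "/phones/a": "smartphone android camera battery",
    "/phones/b": "smartphone ios camera screen",
    "/books/a": "novel paperback fiction author",
    "/books/b": "cookbook hardcover recipes author",
}
LOG = [
    '10.0.0.1 - - [01/Aug/2017:10:00:00 +0300] "GET /phones/a HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Aug/2017:10:00:05 +0300] "GET /books/a HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Aug/2017:10:01:00 +0300] "GET /phones/b HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Aug/2017:10:02:00 +0300] "GET /phones/b HTTP/1.1" 200 512',
    '10.0.0.2 - - [01/Aug/2017:10:03:00 +0300] "GET /books/b HTTP/1.1" 200 512',
    '10.0.0.4 - - [01/Aug/2017:10:04:00 +0300] "GET /books/b HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Aug/2017:10:05:00 +0300] "GET /favicon.ico HTTP/1.1" 404 0',
]

groups = group_users(LOG, PAGES, k=2)
print(groups)
```

Seeding the centroids with the first k visitors keeps the sketch deterministic; a real experiment would more likely use randomised or k-means++ style initialisation and a proper tokenizer with stop-word removal and stemming.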