Section
|
Computer science
|
Title
|
Creating groups for marketing purposes from website usage data
|
Author(-s)
|
Sulova S.D.
|
Affiliations
|
University of Economics - Varna
|
Abstract
|
Grouping customers and extracting knowledge about the resulting groups are important to online businesses because they allow marketing techniques to be applied purposefully. Individuals can then be served in a personalized way, according to the interests and preferences identified for their group. In this article, we suggest a way to identify and create user groups by processing website usage data. We use the visit records stored in the server log files of a selected website and then retrieve and process the text content of the visited web pages. The approach is based on natural language processing technology and uses methods for clustering text documents. The method is tested experimentally with the software product RapidMiner on data from visits to a Bulgarian e-shop.
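The pipeline outlined in the abstract (log records, then page text, then clustering of text documents) can be illustrated with a minimal Python sketch. It is not the RapidMiner process used in the article. The log lines, page texts, the Common Log Format pattern, the TF-IDF weighting, and the small cosine-based k-means variant with farthest-first seeding are all assumptions made for illustration, and every function name is hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical log lines in Common Log Format (the real data is not shown in this record).
LOG_LINES = [
    '10.0.0.1 - - [01/Aug/2017:10:00:00 +0300] "GET /laptops HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Aug/2017:10:01:00 +0300] "GET /tablets HTTP/1.1" 200 498',
    '10.0.0.2 - - [01/Aug/2017:10:02:00 +0300] "GET /tablets HTTP/1.1" 200 498',
    '10.0.0.2 - - [01/Aug/2017:10:03:00 +0300] "GET /laptops HTTP/1.1" 200 512',
    '10.0.0.3 - - [01/Aug/2017:10:04:00 +0300] "GET /shoes HTTP/1.1" 200 301',
    '10.0.0.3 - - [01/Aug/2017:10:05:00 +0300] "GET /jackets HTTP/1.1" 200 322',
    '10.0.0.4 - - [01/Aug/2017:10:06:00 +0300] "GET /jackets HTTP/1.1" 200 322',
    '10.0.0.4 - - [01/Aug/2017:10:07:00 +0300] "GET /shoes HTTP/1.1" 200 301',
    '10.0.0.5 - - [01/Aug/2017:10:08:00 +0300] "GET /laptops HTTP/1.1" 200 512',
    '10.0.0.9 - - [01/Aug/2017:10:09:00 +0300] "GET /missing HTTP/1.1" 404 0',
]

# Hypothetical page texts standing in for the retrieved content of visited pages.
PAGE_TEXT = {
    "/laptops": "laptop notebook computer screen",
    "/tablets": "tablet computer screen touch",
    "/shoes": "running shoes sport footwear",
    "/jackets": "winter jacket sport clothing",
}

# Keep only successful GET requests; capture the client address and the path.
LOG_PATTERN = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" 200 ')


def visits_by_user(lines):
    """Map each client address to the list of paths it requested successfully."""
    visits = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match:
            visits[match.group(1)].append(match.group(2))
    return visits


def normalise(vec):
    """Scale a sparse vector to unit length (empty vectors are returned as-is)."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else vec


def tfidf_vectors(docs):
    """Turn {user: [tokens]} into unit-length TF-IDF vectors {user: {term: weight}}."""
    n = len(docs)
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    return {
        user: normalise({t: c * (math.log(n / df[t]) + 1.0)
                         for t, c in Counter(tokens).items()})
        for user, tokens in docs.items()
    }


def cosine(a, b):
    """Dot product of two unit-length sparse vectors, i.e. their cosine similarity."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans(vectors, k, iterations=10):
    """Cosine-based k-means with deterministic farthest-first seeding."""
    names = sorted(vectors)
    centroids = [vectors[names[0]]]
    while len(centroids) < k:
        # Seed with the user least similar to the centroids chosen so far.
        nxt = min(names, key=lambda n: max(cosine(vectors[n], c) for c in centroids))
        centroids.append(vectors[nxt])
    labels = {}
    for _ in range(iterations):
        new_labels = {n: max(range(k), key=lambda i: cosine(vectors[n], centroids[i]))
                      for n in names}
        if new_labels == labels:
            break
        labels = new_labels
        for i in range(k):
            members = [vectors[n] for n in names if labels[n] == i]
            if members:  # an empty cluster keeps its previous centroid
                total = defaultdict(float)
                for vec in members:
                    for term, weight in vec.items():
                        total[term] += weight
                centroids[i] = normalise(dict(total))
    return labels


def group_users(lines, page_text, k):
    """Build one text document per user from visited pages, then cluster the users."""
    docs = {
        user: [tok for path in paths for tok in page_text.get(path, "").split()]
        for user, paths in visits_by_user(lines).items()
    }
    return kmeans(tfidf_vectors(docs), k)
```

Calling `group_users(LOG_LINES, PAGE_TEXT, k=2)` on this toy data places the visitors of the electronics pages in one group and the visitors of the clothing pages in the other, while the failed request from `10.0.0.9` is ignored. Real logs would also need session identification, removal of robot traffic, and language-specific preprocessing of the page text, none of which is shown here.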
|
Keywords
|
text clustering, group, text mining, Logfile, RapidMiner
|
UDC
|
519.688
|
MSC
|
68P20, 68T50
|
DOI
|
10.20537/vm170314
|
Received
|
1 August 2017
|
Language
|
English
|
Citation
|
Sulova S.D. Creating groups for marketing purposes from website usage data, Vestnik Udmurtskogo Universiteta. Matematika. Mekhanika. Komp'yuternye Nauki, 2017, vol. 27, issue 3, pp. 470-478.
|
References
|
- Etzioni O. The World-Wide Web: quagmire or gold mine?, Communications of the ACM, 1996, vol. 39, issue 11, pp. 65-68. DOI: 10.1145/240455.240473
- Sulova S. Application of web mining in customer relationship management, Izvestia, Journal of the Union of Scientists - Varna, Economic Sciences Section, 2015, issue 1, pp. 105-110. https://ideas.repec.org/a/vra/journl/y2015i1p105-110.html
- Cooley R., Mobasher B., Srivastava J. Web mining: information and pattern discovery on the World Wide Web, Proceedings Ninth IEEE International Conference on Tools with Artificial Intelligence (ICTAI'97), IEEE Computer Society, 1997, pp. 558-567. DOI: 10.1109/TAI.1997.632303
- Markov Z., Larose D.T. Data mining the web: uncovering patterns in web content, structure, and usage, New Jersey: John Wiley & Sons, 2007, 218 p.
- Kumar E. Natural language processing, New Delhi: I.K. International Publishing House Pvt. Ltd., 2011, 224 p.
- Fayyad U., Piatetsky-Shapiro G., Smyth P. From data mining to knowledge discovery in databases, AI Magazine, 1996, vol. 17, no. 3, pp. 37-54. DOI: 10.1609/aimag.v17i3.1230
- Fan W., Wallace L., Rich S., Zhang Z. Tapping the power of text mining, Communications of the ACM, 2006, vol. 49, issue 9, pp. 76-82. DOI: 10.1145/1151030.1151032
- Pena-Ayala A. Educational data mining. Applications and trends, Heidelberg: Springer International Publishing, 2014, xviii + 468 p. DOI: 10.1007/978-3-319-02738-8
- Tarczynski T. Document clustering - concepts, metrics and algorithms, International Journal of Electronics and Telecommunications, 2011, vol. 57, issue 3, pp. 271-277. DOI: 10.2478/v10177-011-0036-5
- Hartigan J.A., Wong M.A. Algorithm AS 136: a $k$-means clustering algorithm, Journal of the Royal Statistical Society. Series C (Applied Statistics), 1979, vol. 28, no. 1, pp. 100-108. DOI: 10.2307/2346830
- Dixit D., Kiruthika M. Preprocessing of web logs, International Journal on Computer Science and Engineering, 2010, vol. 2, issue 7, pp. 2447-2452. http://www.enggjournals.com/ijcse/doc/IJCSE10-02-07-20.pdf
- Wong S.K.M., Raghavan V.V. Vector space model of information retrieval: a reevaluation, SIGIR '84 Proceedings of the 7th annual international ACM SIGIR conference on Research and development in information retrieval, 1984, Cambridge, England, pp. 167-185. http://dl.acm.org/citation.cfm?id=636816
- Jing L., Ng M.K., Yang X., Huang J.Z. A text clustering system based on $k$-means type subspace clustering and ontology, International Journal of Computer, Electrical, Automation, Control and Information Engineering, 2008, vol. 2, no. 4, pp. 1296-1308. http://waset.org/publications/2401
- Steinbach M., Karypis G., Kumar V. A comparison of document clustering techniques, KDD Workshop on Text Mining, 2000. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.125.9225
- Antony S., Wagh R. Study on text clustering for topic identification, International Journal of Advanced Research in Computer Science, 2017, vol. 8, no. 1, pp. 161-164. http://ijarcs.info/index.php/Ijarcs/article/view/2874
- Linden A., Krensky P., Hare J., Idoine C.J., Sicular S., Vashisth S. Magic quadrant for data science platforms. https://www.gartner.com/doc/3606026/magic-quadrant-data-science-platforms
- Huang A. Similarity measures for text document clustering, Proceedings of the Sixth New Zealand Computer Science Research Student Conference (NZCSRSC2008), University of Canterbury, Christchurch, 2008, pp. 49-56. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.332.4480
- Sandhya N., Lalitha Y.S., Govardhan A., Anuradha K. Analysis of similarity measures for text clustering, International Journal of Data Engineering, 2008, vol. 2, issue 4. http://www.cscjournals.org/manuscript/Journals/IJDE/Volume2/Issue4/IJDE-63.pdf
- Singh A., Yadav A., Rana A. K-means with three different distance metrics, International Journal of Computer Applications, 2013, vol. 67, no. 10, pp. 13-17. DOI: 10.5120/11430-6785
- Davies D.L., Bouldin D.W. A cluster separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence, 1979, vol. PAMI-1, issue 2, pp. 224-227. DOI: 10.1109/TPAMI.1979.4766909
|