Statistical and machine learning methods of online behaviours analysis.
Abstract
The success of a corporate is highly influenced by the effectiveness and appeal of its website. This study was conducted on TEKmation, a South African corporate whose board of directors lacked insight into how its website was used. The study aimed to quantify the web-traffic flow, detect the underlying browsing patterns, and validate the effectiveness of the web design. The website recorded 7,935 visits and 57,154 page views from 1 June 2021 to 30 June 2023 (data sourced from Google Analytics). Grubbs' test identified outliers in visit frequency, page views per visit, and visit duration. A small degree of missingness was observed in the mobile device branding (1.24%) and operating system (0.03%) features; the missing values were imputed using a Bayesian network model. To address a detected data shift, an artificial neural network (ANN) was proposed to flag future data shifts, with the period of the year and the volume of sessions as important predictors. Prior to clustering, feature selection methods assessed feature variability and feature association. The results indicated that low-incidence webpages and features with natural relationships should be omitted. The K-means, DBSCAN and hierarchical unsupervised machine learning methods were employed to identify the visit personas, labelled get-in-touch (12%), accidentals (11%), dropoffs (30%), engrossed (38%) and seekers (9%). The premature drop-offs evidently needed further exploration. The Cox proportional hazards survival model and the random survival forest (RSF) model identified the web browser, visit frequency, device category, distance, certain webpages, volume of hits, and organic searches as drop-off hazards. A tiered Markov chain model was developed to compute the transition probabilities of dropping off. The contact (63%) and clients (50%) states recorded a high likelihood of dropping off early in the visit.
In conclusion, using statistical methods, the study informed the board about its audience and the flaws of the website, and proposed recommendations to address these concerns.
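As a rough illustration of the drop-off transition probabilities mentioned above, the following minimal sketch estimates a plain first-order Markov chain from page sequences. It is not the tiered model developed in the study; the page names, the `drop-off` state label, the `transition_probabilities` helper and the toy visits are all assumptions made for the example.

```python
from collections import Counter, defaultdict

EXIT = "drop-off"  # hypothetical absorbing state marking the end of a visit


def transition_probabilities(visits):
    """Estimate first-order transition probabilities from page sequences."""
    counts = defaultdict(Counter)
    for pages in visits:
        path = list(pages) + [EXIT]  # every visit ends by leaving the site
        for current, nxt in zip(path, path[1:]):
            counts[current][nxt] += 1
    # Normalise each row of counts so the outgoing probabilities sum to 1.
    return {
        state: {nxt: n / sum(following.values()) for nxt, n in following.items()}
        for state, following in counts.items()
    }


# Toy visits, invented for illustration; not the study's data.
visits = [
    ["home", "contact"],
    ["home", "clients", "contact"],
    ["home", "services", "clients"],
    ["contact"],
]
probs = transition_probabilities(visits)
print(probs["contact"][EXIT])  # share of contact-page views followed by leaving
print(probs["clients"][EXIT])
```

Each row of the resulting table is the empirical distribution of the next state given the current page, so the entry for the `drop-off` state plays the role of a per-page drop-off probability.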
Summary.
The success of large businesses is strongly driven by the effectiveness and appearance of their websites. This study was conducted on TEKmation, a large South African company whose board of directors appeared to lack knowledge of how the website was used. The study aimed to record the volume of traffic on the website, examine browsing patterns, and assess how well the website works. The website received 7,935 visits and 57,154 page views from 1 June 2021 to 30 June 2023 (data supplied by Google Analytics). Grubbs' test revealed outliers in visit frequency, page views per visit, and the time a visitor spent on the website. A small amount of information was missing on mobile device brand (1.24%) and operating system (0.03%), and was imputed using a Bayesian network. To deal with a shift in the data, an artificial neural network (ANN) was proposed to flag data shifts, based on the period of the year and the volume of sessions. Before clustering, feature selection methods examined the variability of the features and the associations between them. The results showed that rarely visited webpages and naturally related features should be dropped. K-means, DBSCAN and hierarchical machine learning methods were used to identify the visit personas: get-in-touch (12%), accidentals (11%), dropoffs (30%), engrossed (38%) and seekers (9%). It was clear that the early drop-offs needed further explanation.
The Cox proportional hazards survival model and the random survival forest (RSF) showed that the web browser, visit frequency, device category, distance, certain webpages, volume of hits, and organic searches all indicated risk of dropping off. A tiered Markov chain model was built to compute the probability of dropping off. The contact (63%) and clients (50%) states showed a high likelihood of visitors dropping off. In conclusion, using statistical information, this study informed the board about its visitors and the problems with the website, and then proposed recommendations to be acted on.
Description
Doctoral Degree. University of KwaZulu-Natal, Durban.