Voyant bjalo ka senolofatši sa phetleko ya dingwalo

Author: Dimakatso Mathe (SADiLaR Sesotho sa Leboa researcher)
English translation for this blog at the bottom.

Lenaneo la go dira diphatišišo le akaretša go tsitsinkela sengwalo ka leihlo la ntšhotšhonono ka maikemišetšo a go utulla se monyakišiši a ratago go se nepiša. Le ge go le bjalo, go na le tšeo leihlo la nama le ka šitwago go di lemoga ge sengwalo seo e le se setelele. Go fa mohlala, modiro wa go bala gore leina la moanegwa yo a itšego le tšwelela gakae sengwalong sa padi, e ka ba se sengwe seo leihlo le ka šitwago go se phethagatša. Ka mahlatse, sedirišwa sa go swana le Voyant[1], se ka phethagatša modiro wo ka ponyo ya leihlo le go go fa dipoelo tša go kgodiša tšeo di nepagetšego. Voyant ke sedirišwa sa inthanete seo se tsebjago kudu morerong wa dithuto tša botho tša ditšitale (Digital Humanities). Moreromogolo wa sedirišwa se ke go nolofatša modiro wa go fetleka tshedimošo, ka maikemišetšo a go fihlelela goba go tiišetša kgopolo yeo e itšego ya phatišišo. Sona se thuša go bea pepeneneng diteng tša tshedimošo ya sengwalo tšeo di ka gogago šedi ya monyakišiši ntle le go badišiša sengwalo go tloga mathomong go fihlela mafelelong. Ntlha yona ye o fetšago go e bala, re tla boela go yona mafelelong a sengwalo se gore re e otlolle.

Seswantšho se se tšwelelago ka fase, se swailwe ka ditlhaka tša go tloga go A go fihlela go E, gomme tšona di laetša tše dingwe tša dikarolo tša Voyant tše di ka dirišwago go fetleka tshedimošo. Tšona ke cirrus (A), reader (B), trends (C), summary (D) le contexts (E). Therišano ya rena e tla ithekga ka tšona dikarolo tše di boletšwego gomme ra tsopola ka boripana mešomo goba mehola ya tšona. Diteng tše di lego ka seswantšhong se sa Voyant di tšerwe kanegelong yeo e phatlaladitšwego ke Nal’ibali inthaneteng [2]. Yona e tsentšhitšwe ka go sedirišwa sa Voyant ka mokgwa wa go ngwatha diteng tša sengwalo gomme tša pharwa ka go Voyant, ke gore “cut & paste”.

Seswantšho sa Voyant

 

 

Mohola wa karolo ya cirrus yeo e swailwego ka tlhaka ya A seswantšhong, ke go hlagiša goba gona go bonagatša mantšu ao a tšwelelago gantši go feta a mangwe sengwalong. Mantšu ao a tšweleditšwe ka mebala ya go fapana ebile re lemoga gore ke a magolo kudu ge a bapetšwa le mantšu a mangwe ao a hlagišitšwego. Ge o batametša ntlhanakhomphuthara (mouse pointer) kgauswi le mantšu ao a hwetšagalago ka go cirrus, o tla kgona go bona gore lentšu leo le tšwelela gakae sengwalong seo. Gona fao, re lemoga gore lentšu le “Temo” ke le lengwe la mantšu ao a tšwelelago gantši sengwalong. Se ga se makatše ka ge moanegwathwadi wa kanegelo e le Temo, gomme ditiragalo tša kanegelo di dikuloga godimo ga gagwe. Seo ke sona se hlolago gore a fele a tšwelela kgafetša kanegelong ka ge e le moanegwathwadi.

Ge re gatela pele, karolo ya reader, yeo e swailwego ka tlhaka ya B seswantšhong sa rena, yona mohola wa yona ke go kgontšha monyakišiši (goba modiriši wa Voyant) go bala sengwalo go ya le ka mokgwa wo se tšwelelago ka gona. Ka mantšu a mangwe, diteng tša kanegelo di alwa go tloga mathomong go fihla mafelelong go ya le ka mokgwa wo di tšwelelago ka gona sengwalong. Go swana le karolong ya cirrus, ge o batametša ntlhanakhomphuthara godimo ga lentšu le lengwe le le lengwe, go tšwelela tshedimošo ya go laetša gore lentšu leo le tšwelela gakae sengwalong.

Karolo ya trends, ye e laeditšwego ka tlhaka ya C, ke kerafo ye e laetšago bontši bja mantšu go ya le ka moo a tšwelelago dikarolong tša sengwalo. Mebala ya methalokerafo e nyalelana le ya mantšu ao a tšwelelago ka go cirrus. Le ge a mangwe a mantšu ao a tšwelelago go feta a mangwe a ka tšweletšwa ka mebala ya go swana, bjale ka ge re bona mantšu[3] a “a” le “ba” a tšweletšwa ka talalerata seswantšhong, ge o batametša ntlhanakhomphuthara godimo ga mothalokerafo, o tla hwetša tshedimošo ka botlalo mabapi le lentšu le mothalokerafo o le emetšego. Go feta fao, go na le mapokisana ka godimo ga methalokerafo ao a bontšhago gore mebala ya methalokerafo e emetše mantšu afe.

Karolo ya D, yeo e lego summary, yona ke kakaretšo ya tshedimošo ya sengwalo ka ge e itlhaloša. E laetša palomoka ya dingwalwa tše di fetlekwago (ke se setee mo lebakeng le), palomoka ya mantšu ao a hwetšwago sengwalong (1 092), nako yeo tshedimošo e tsentšhitšwego ka go sedirišwa sa Voyant (e hlamilwe gona bjale), gammogo le palomoka ya mantšu ao a hlamago mafoko (bontši bja mafoko a sengwalo se a bopša ke mantšu a 13). Ka go le lengwe, contexts ye e swailwego ka tlhaka ya E le yona e laetša lentšu le le tšwelelago ka bontši. Ga go felele fao, se se ikgethilego ka yona ke gore e laetša sekafoko se se tšwelelago ka go la nngele le sa ka go la go ja ga lentšu leo. Ka go realo, e laetša tšhomišo ya lentšu leo ka go utulla dikafoko tše di panago mmogo le lona lentšu leo.

Go na le dikarolo tše dingwe tša Voyant tšeo re sego ra bolela ka tšona sengwalong se, ebile o hlohleletšwa go ikgwathela tšona maitekelong a gago a go ithuta le go šomiša Voyant. Ke tše dintše tše Voyant e ka di dirago, eupša go na le tšeo e ka se go direlego tšona. Se se re bušetša ntlheng ye e kgwathilwego matsenong mabapi le bokgoni bja Voyant bja go utullela monyakišiši tshedimošo ya go tanya šedi ntle le go badišiša sengwalo go tloga mathomong go fihla mafelelong. Voyant e kgona go bala palo ya mantšu sengwalong eupša seo e ka se kego ya se dira ke go go utullela moko goba morero wa sengwalo. Se se laetša gabotse gore go tloga go le bohlokwa gore o ipalele sengwalo le go kwešiša diteng tša sona gore o kgone go hlatholla tswalano magareng ga mantšu ao a tšwelelago kudu sengwalong bjale ka ge go tšweleditšwe seswantšhong. Ka mantšu a mangwe, Voyant e ka se go direle phatišišo eupša go molaleng gore e tla go nolofaletša morero wa go dira phatišišo. Ge o na le kgahlego ya go ithuta kutšwana ka ga Voyant, o se ke wa diega go ikgokaganya le SADiLaR gore o dire kgopelo ya go fiwa tlhahlo ntle le tefo. Gore o hwetše tshedimošo ka botlalo mabapi le go dira kgopelo ya tlhahlo ya Voyant go tšwa go SADiLaR, kgotla mo.

 

 

[1] Gore o fihlelele sedirišwa sa Voyant, kgotla mo https://voyant-tools.org/

[2] O ka fihlelela sengwalo se go bolelwago ka sona go https://nalibali.org/story-library/multilingual-stories/temo-le-mahodu-dimela

[3] Lemoga gore mo sengwalong se, lereo le mantšu/lentšu le šomišwa ntle le go šetša gore ke karolo efe ya sehlophantšu goba mohuta ofe wa sekantšu. 


 

Voyant as an enabling tool for text analysis

The research process involves scrutinising texts closely in order to achieve a particular aim. However, some tasks are hard to carry out by eye when the text is long. For example, counting how many times the name of a particular character appears in a novel can be a daunting task. Fortunately, a tool such as Voyant[1] can perform this task in the blink of an eye and provide convincing and accurate results. Voyant is a web-based application that is well known in Digital Humanities. Its main purpose is to simplify text analysis, with the aim of arriving at or confirming a particular research idea. The application helps to bring to the surface content in a text that might interest the researcher, without the researcher having to read the document from beginning to end. We will return to this point in the conclusion of this discussion to elaborate on it.

The illustration below is labelled A to E and shows some of Voyant’s features that can be used to analyse text, i.e. cirrus (A), reader (B), trends (C), summary (D) and contexts (E). Our discussion will be based on these features and we will briefly explain their functions. The contents in the Voyant illustration were taken from a story published on the internet by Nal’ibali[2]. The story was entered into the Voyant application by copying the text of the document and pasting it into Voyant (“cut & paste”).

 

Voyant illustration

The function of the cirrus feature, marked A on the illustration, is to visualise the words that appear more frequently than others in the document. These words are shown in different colours, and they are noticeably larger than the other words displayed. When you move the mouse pointer over a word in the cirrus, you can see how many times that word appears in the document. Looking at the illustration, it is clear that the word “Temo” is one of the words that appear most frequently in the document. This is not surprising, as the main character of the story is Temo, and the events in the story revolve around her.
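The counting behind a word cloud of this kind can be sketched in a few lines of Python. This is only an illustration of the idea, not Voyant's own implementation: the function name `word_frequencies` and the sample text are made up for this example, and a "word" here is simply any run of letters or digits.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each word form occurs, ignoring upper/lower case."""
    # \w+ matches runs of letters and digits, including letters such as "š".
    words = re.findall(r"\w+", text.lower())
    return Counter(words)


# Made-up sample text (not the Nal'ibali story itself).
sample = "Temo waters the plants. The thieves watch Temo. Temo hides."
freq = word_frequencies(sample)

# List the most frequent word forms first, as a word cloud would emphasise them.
for word, count in freq.most_common(3):
    print(word, count)
```

A word cloud then only has to map each count to a font size, so that the most frequent words are drawn largest.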

The function of the reader feature, marked B on our illustration, is to enable the researcher (or the Voyant user) to read the document as it appears in its original form. In other words, the contents of the story are outlined according to how they appear from beginning to end in the document. If you move the mouse pointer over a word, additional information appears that shows how many times the word appears in the document.

The trends feature, marked C, is a graph that shows the distribution of word frequency, i.e. how often words appear in the different sections of the document. The colours of the graph lines match those of the words that appear in the cirrus. Some of the frequent words may be shown in the same colour, as we can see with the words[3] “a” and “ba”, which both appear in light blue on the illustration. In that case, if you move the mouse pointer over a graph line, you will find full details of the word that the line represents. Moreover, there are boxes above the graph lines showing which words the different colours represent.
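The idea behind such a graph can be sketched as follows, assuming the document is cut into roughly equal-sized segments and the chosen word is counted in each one. The function name `segment_counts`, the default of five segments and the sample text are assumptions made for this illustration; Voyant's own segmentation may differ.

```python
import re


def segment_counts(text, word, segments=5):
    """Count occurrences of `word` in each of `segments` roughly equal parts."""
    tokens = re.findall(r"\w+", text.lower())
    # Ceiling division, so that every token falls into one of the segments.
    size = max(1, -(-len(tokens) // segments))
    target = word.lower()
    counts = []
    for i in range(segments):
        chunk = tokens[i * size:(i + 1) * size]
        counts.append(chunk.count(target))
    return counts


# Made-up sample text (not the Nal'ibali story itself).
sample = "Temo waters the plants. The thieves watch Temo. Temo hides."

# One number per segment; plotted in order, these form one graph line.
print(segment_counts(sample, "Temo"))
```

Each word of interest yields one such list of counts, and each list corresponds to one coloured line on the graph.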

Feature D, which is summary, is an overview of the information in the document, as its name suggests. It shows the total number of documents that are being analysed (in this case there is one), the total number of words found in the document (1 092), when the information was uploaded to the Voyant tool (it has just been created), and the number of words per sentence (most of the sentences in the document are made up of 13 words). On the other hand, the contexts feature, marked E, also shows a frequently occurring word. What stands out about it is that it shows the phrases to the left and to the right of that word. It therefore shows how the word is used by revealing the phrases that occur together with it.
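Both ideas can be sketched in Python under simple assumptions: sentences end at ".", "!" or "?", and the context of a word is a fixed number of words on either side of it. The function names, the `window` parameter and the sample text are made up for this illustration and are not taken from Voyant.

```python
import re


def average_words_per_sentence(text):
    """Average number of words per sentence, splitting on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    total_words = sum(len(re.findall(r"\w+", s)) for s in sentences)
    return total_words / len(sentences)


def keyword_in_context(text, keyword, window=3):
    """Return (left, keyword, right) rows for every occurrence of `keyword`."""
    # Punctuation is dropped, so a context may run across sentence boundaries.
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            rows.append((left, token, right))
    return rows


# Made-up sample text (not the Nal'ibali story itself).
sample = "Temo waters the plants. The thieves watch Temo. Temo hides."

print(average_words_per_sentence(sample))
for left, word, right in keyword_in_context(sample, "Temo", window=2):
    print(f"{left:>15} | {word} | {right}")
```

Lining the keyword up in a middle column, as the last loop does, makes it easy to scan the words that tend to occur immediately before and after it.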

There are other Voyant features which were not mentioned in this discussion, and you are encouraged to try them yourself as you explore and learn how to use Voyant. There is a lot that Voyant can do, but it also has its limitations. This takes us back to the point we touched on in the introduction regarding Voyant’s ability to bring to the surface content that might interest the researcher without the researcher having to read the document from beginning to end. Voyant can count the number of words in a document, but it cannot reveal the main purpose or theme of a text. This clearly shows that it is very important to read the text yourself and understand its content, so that you are able to explain the relationship between the words that appear frequently in the text, as shown in the illustration. In other words, Voyant cannot do research on your behalf, but it can clearly facilitate the process of doing research. If you are interested in learning more about Voyant, do not hesitate to contact SADiLaR to request training, which is offered free of charge. For more information regarding requesting training on Voyant from SADiLaR, click here.

 

[1] To access the Voyant application, go to https://voyant-tools.org/

[2] You may access the aforementioned document on https://nalibali.org/story-library/multilingual-stories/temo-le-mahodu-dimela

[3] Note that in this text, the term word/words has been used without taking into account the parts of speech or morphemes of the forms mentioned.