geoffreyrockwell

History of Humanities Computing

 Uncategorized  Comments Off
Mar 01 2013
 

Stéfan and I were discussing the Canadian history of humanities computing and text analysis. Is it true that there is a tradition of text analysis tool development in Canada? Has this been an area of strength? How would we answer these questions?

I have put together a wiki of information on the Canadian capacity in this area:

http://tapor.ualberta.ca/taporwiki/index.php/The_Academic_Capacity_of_the_Digital_Humanities_in_Canada

A good history of humanities computing in Canada has yet to be written.

Geoffrey R.

Feb 22 2013
 

We are trying to figure out how to integrate the following three useful components:

  • Glossary - We have a glossary of terms now. Should it be expanded?
  • Appendix on Text Preparation - We need to provide more information on how to find and prepare a text. We have a draft appendix, but need to develop it. My sense is that a lot of problems occur in the text preparation.
  • Recipes - Over the years we have created recipes and outlines for workshops. We need to figure out how to weave those into the book. We don't want to overwhelm the book with information that might get dated quickly, but we would like this to genuinely help people try it out.

Cool New Tool

Feb 22 2013
 

Stéfan has created a cool new tool that you can see with the Humanist archive here:

http://voyant-tools.org/tool/TermsRadio/?corpus=humanist&stopList=stop.en.taporware.txt

This tool plays with ideas we have had about real-time analytics and animation of analytics. It lets you explore the evolution of themes in Humanist.

With this tool you can see clearly the explosion of interest in the web, but what is more interesting is which other words rose or fell with the web. How did the shift to the web affect humanists?

Some of the words that intrigued me were "department" and "London."

Nov 12 2012
 

One thing we need to do is to figure out the relationship of literary text analysis and linguistic text analysis. Could the two merge? Should literary build on linguistic? Some initial differences:

  • Linguists are interested in the analysis of much smaller units, while the lit folk are generally looking at large-scale trends.
  • Linguists therefore tend to build computational tools that work on smaller units and are designed to demonstrate a theory, while literary analysis tools are meant to support reading practices where the theory might be developed through multiple practices.
  • Linguists are interested in developing formal descriptions of linguistic phenomena, that is, theories of language, while literary analysts are interested in understanding particular texts. In this way linguistics is more of a science, while literary analysis belongs to the humanities.

An interesting exception is corpus linguistics, which tends to work on larger corpora, though corpus linguists use them to develop theories about language, not literature.

Collapsing terms

Oct 26 2012
 

We have added a feature now that will collapse a set of terms in the Word Trends. That allows one to develop a group of related terms and then graph the group as a whole. This is important given the forms a word or theme can take.
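
The idea of collapsing terms can be sketched in a few lines of Python. This is only an illustration of the general technique, not how Voyant implements it: the function names, the sample text, and the choice of two segments are all assumptions made for the example. The text is split into equal segments, words are counted in each, and the counts for every term in a group are summed so the group can be graphed as one line.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase the text and return its word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def segment_counts(tokens, segments):
    """Split tokens into roughly equal segments and count the words in each."""
    size = max(1, -(-len(tokens) // segments))  # ceiling division
    return [Counter(tokens[i:i + size]) for i in range(0, len(tokens), size)]

def collapsed_trend(counts, group):
    """Sum the counts of every term in the group, segment by segment."""
    return [sum(c[term] for term in group) for c in counts]

# Hypothetical sample text and term group, invented for this sketch.
text = ("The web grew. Websites spread. The internet and the web "
        "changed how humanists worked online.")
counts = segment_counts(tokenize(text), segments=2)
web_group = {"web", "websites", "internet", "online"}
print(collapsed_trend(counts, web_group))  # → [3, 2]
```

The group here is a plain set of word forms chosen by hand, which matches the point above that a word or theme can take several forms.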

Stéfan also ran Hume's Dialogues Concerning Natural Religion through Mallet and here are the topics proposed:

It hardly seems fair for only me to have homework. Here's yours: look through the list of 20 topic clusters below to see if there's anything of interest. I took the entire dialogue, distilled the nouns, broke it up into parts, and fed it through Mallet to do topic modelling.

  • 0 cause nature mind idea object effect manner work consequence difficulty word recourse quality meaning phenomenon existence way abstract appellation
  • 1 world order operation ignorance fact faculty difficulty judgement person variety action course disorder thing soul supposition solution antagonist thought
  • 2 man mankind view degree intention theology opposition state temper reflection side law vice reach virtue violence fortune passage affair
  • 3 power inference creature conjecture hypothesis author inconvenience spring rest preservation architect endowment workmanship notion industry stock volition general number
  • 4 experience man arrangement earth similarity perfection house resemblance mean conclusion piety propriety representation situation presumption curiosity moon temerity fancy
  • 5 part universe circumstance appearance question economy present regard understanding absurdity advantage conclusion faculty plan prejudice proportion activity kind bound
  • 6 principle world reason animal thing generation origin rule vegetable vegetation machine step observation planet tree standard species system essence
  • 7 life misery pain pleasure phenomenon benevolence happiness goodness attribute enemy feeling enjoyment rectitude condition death complaint health wickedness folly
  • 8 matter order form motion hypothesis thought system probability position force revolution adjustment experience moment situation element alteration instance change
  • 9 deity attribute supposition production figure anthropomorphism species force respect discovery philosopher ship experiment solidity scale controversy mouth cloud surface
  • 10 religion regard dispute superstition eternity time controversy maxim terror motive artifice inclination morality dogmatist proposition suspense research oath impulse
  • 11 philosophy science scepticism mind opinion evidence reality sect education doctrine kind moral history remark light heart humour earnest life
  • 12 argument body theory art proof place kind system language foundation theism end scruple theist discourse scene air knowledge state
  • 13 principle sense philosopher inquiry passion atheist doubt sceptic certainty composition resemblance school company comprehension disposition uncertainty learning self danger
  • 14 system age truth method event point sense manner determination people satisfaction eye conversation weight turn veneration horror war triumph
  • 15 animal purpose society opinion capacity love scepticism beauty want praise attention prospect abuse energy plant impossibility concession prosperity contentment
  • 16 analogy design contrivance objection case intelligence instance occasion term invention voice parent chaos volume interest instinct structure darkness fertility
  • Note: topic 16 seems to be words around the argument for design
  • 17 reason sentiment species conduct influence use difference assent imagination faith structure affection case god circumstance study stroke apprehension belief
  • 18 nature spirit authority account number hand comparison name wisdom disposition eye day expression importance mercy justice master race sorrow
  • 19 necessity argument being thing succession existence time topic contradiction beginning product conception error chain chance dialogue nonexistence accident weakness
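
The "broke it up into parts" step described above can be sketched as follows. This is a guess at the general shape of that step, not Stéfan's actual script: the function names, the part size, and the file naming are assumptions, and the noun-distilling step is left out because it needs a part-of-speech tagger. Each part is written to its own file, since Mallet can import a directory of text files as one document per file.

```python
from pathlib import Path
import tempfile

def split_into_parts(text, words_per_part):
    """Break a text into consecutive parts of at most words_per_part words."""
    words = text.split()
    return [
        " ".join(words[i:i + words_per_part])
        for i in range(0, len(words), words_per_part)
    ]

def write_parts(parts, directory):
    """Write each part to its own numbered file and return the paths."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, part in enumerate(parts):
        path = directory / f"part_{number:03d}.txt"  # hypothetical naming scheme
        path.write_text(part, encoding="utf-8")
        paths.append(path)
    return paths

# A stand-in for the noun-only text; the real input would be the whole dialogue.
sample = "cause nature mind idea object effect manner work consequence difficulty"
parts = split_into_parts(sample, words_per_part=4)
with tempfile.TemporaryDirectory() as folder:
    written = write_parts(parts, folder)
    print([p.name for p in written])
```

How large the parts should be is a judgment call: smaller parts give the topic model more documents to compare, but each one carries less context.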

 

 

Ideas from Just What Do They Do?

Oct 09 2012
 

We have a SSHRC-funded project that is looking at just what people do with text analysis. Some of the interesting points that came up:

  • Could we use the colour and orientation in Cirrus for information? Could we use colour for clusters? What would orientation then mean?
  • How do people use existing tools? It seems people use Excel, Word, and Google as research tools. How do they use these? Could you do text analysis without specialized tools?
  • Some people want analytics integrated into text environments. What would be the best way to do that?
  • Many people don't see the big picture - they are reacting to what they are shown, but don't have ideas as to what text analysis could/should be.

 

THATcamp Kansas

Sep 21 2012
 

I (Geoffrey Rockwell) am giving a workshop on Voyant at the Kansas 2012 THATcamp.

 

 

From Concordance to Ubiquitous Analytics

Jun 20 2012
 

We have finished another chapter, the one that provides a history of text analysis from concordances to ubiquitous analytics. Hurrah!
