<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-19974449</id><updated>2011-11-14T12:50:56.290-05:00</updated><title type='text'>Digital History Hacks (2005-08)</title><subtitle type='html'>Methodology for the infinite archive.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://digitalhistoryhacks.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default?start-index=101&amp;max-results=100'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>165</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-19974449.post-8700322530083964177</id><published>2011-02-10T09:04:00.002-05:00</published><updated>2011-02-10T09:05:33.722-05:00</updated><title type='text'>GitHub Source Code Repository</title><content type='html'>There is now a &lt;a href="https://github.com/williamjturkel/Digital-History-Hacks--2005-08-"&gt;GitHub source code&lt;/a&gt; repository for all of the code from &lt;span style="font-style:italic;"&gt;Digital History Hacks&lt;/span&gt;.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8700322530083964177?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8700322530083964177'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8700322530083964177'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2011/02/github-source-code-repository.html' title='GitHub Source Code Repository'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-9168326214562562904</id><published>2008-12-29T15:16:00.002-05:00</published><updated>2008-12-29T15:30:55.633-05:00</updated><title type='text'>Coda</title><content type='html'>When I began this blog, I had the idea that it would be an integral part of my critical and reflective technical practice.  For the past three years, it has served admirably, providing an easy way to share ideas and code and putting me in touch with a wide range of colleagues and new friends.  During that time I've tried to stay true to the promise of "hacks," even if I pushed the boundaries of both "digital" and "history".  As my technical work has evolved, however, I've begun to feel like this blog is less and less suited to my day-to-day activities.  Rather than try and force it to fit, I've decided to build something new.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-9168326214562562904?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9168326214562562904'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9168326214562562904'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/12/coda.html' title='Coda'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5400156774772230187</id><published>2008-12-09T10:36:00.011-05:00</published><updated>2008-12-09T12:50:21.772-05:00</updated><title type='text'>Some Winter Reading for Humanist Makers</title><content type='html'>(Crossposted to Cliopatria &amp;amp; Digital History Hacks)&lt;br /&gt;&lt;br /&gt;In December 2004, I bought a copy of Joe Martin's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Tabletop-Machining-Joe-Martin/dp/0966543300/"&gt;Tabletop Machining&lt;/a&gt;&lt;/span&gt; to see what would be involved in learning how to make clockwork mechanisms and automata.  It was pretty obvious that I had many years of study ahead of me, but I had just finished my PhD and knew that publishing that would take a few years more.  So I didn't mind beginning something else that might take ten or fifteen years to master.  Since then, I've been reading steadily about making things, but it wasn't until this past fall that I actually had the chance to set up a small &lt;a href="http://digitalhistory.wikispot.org/Lab_for_Humanistic_Fabrication"&gt;Lab for Humanistic Fabrication&lt;/a&gt; and begin making stuff in earnest.  Since it's December again, I thought I'd put together a small list of books to help other would-be humanist makers.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Alexander, Christopher. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Notes-Synthesis-Form-Harvard-Paperbacks/dp/0674627512/"&gt;Notes on the Synthesis of Form&lt;/a&gt;&lt;/span&gt; (Harvard, 1964).&lt;/li&gt;&lt;li&gt;Ball, Philip. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Made-Measure-Materials-21st-Century/dp/0691009759/"&gt;Made to Measure: New Materials for the 21st Century&lt;/a&gt;&lt;/span&gt; (Princeton, 1999).&lt;/li&gt;&lt;li&gt;Barrett, William. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Illusion-Technique-William-Barrett/dp/0385112025/"&gt;The Illusion of Technique&lt;/a&gt;&lt;/span&gt; (Anchor, 1979).&lt;/li&gt;&lt;li&gt;Basalla, George. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Evolution-Technology-Cambridge-Studies-History/dp/0521296811/"&gt;The Evolution of Technology&lt;/a&gt;&lt;/span&gt; (Cambridge, 1989).&lt;/li&gt;&lt;li&gt;Bryant, John and Chris Sangwin. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/How-Round-Your-Circle-Engineering/dp/069113118X/"&gt;How Round is Your Circle? Where Engineering and Mathematics Meet&lt;/a&gt;&lt;/span&gt; (Princeton, 2008).&lt;/li&gt;&lt;li&gt;Dourish, Paul. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Where-Action-Foundations-Embodied-Interaction/dp/0262541785/"&gt;Where the Action Is: The Foundations of Embodied Interaction&lt;/a&gt;&lt;/span&gt; (MIT, 2004).&lt;/li&gt;&lt;li&gt;Edgerton, David. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Shock-Old-Technology-Global-History/dp/0195322835/"&gt;The Shock of the Old: Technology and Global History since 1900&lt;/a&gt;&lt;/span&gt; (Oxford, 2006).&lt;/li&gt;&lt;li&gt;Frauenfelder, Mark and Gareth Branwyn. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Best-MAKE-Mark-Frauenfelder/dp/059651428X/"&gt;The Best of MAKE&lt;/a&gt;&lt;/span&gt; (O'Reilly, 2007).&lt;/li&gt;&lt;li&gt;Gershenfeld, Neil. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Fab-Revolution-Desktop-Computers-Fabrication/dp/0465027466/"&gt;Fab: The Coming Revolution on Your Desktop--from Personal Computers to Personal Fabrication&lt;/a&gt;&lt;/span&gt; (Basic, 2007).&lt;/li&gt;&lt;li&gt;Gordon, J. E. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Structures-Things-Dont-Fall-Down/dp/0306812835/"&gt;Structures: Or Why Things Don't Fall Down&lt;/a&gt;&lt;/span&gt; (Da Capo, 2003).&lt;/li&gt;&lt;li&gt;Gordon, J. E. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Science-Materials-through-Princeton-Library/dp/0691125481/"&gt;The New Science of Strong Materials: Or Why You Don't Fall through the Floor&lt;/a&gt;&lt;/span&gt; (Princeton, 2006).&lt;/li&gt;&lt;li&gt;Harper, Douglas. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Working-Knowledge-Skill-Community-Small/dp/0226316882/"&gt;Working Knowledge: Skill and Community in a Small Shop&lt;/a&gt;&lt;/span&gt; (Chicago, 1987).&lt;/li&gt;&lt;li&gt;Igoe, Tom. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Making-Things-Talk-Practical-Connecting/dp/0596510519/"&gt;Making things Talk: Practical Methods for Connecting Physical Objects&lt;/a&gt;&lt;/span&gt; (Make Books, 2007).&lt;/li&gt;&lt;li&gt;Ingold, Tim. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Perception-Environment-Essays-Livelihood-Dwelling/dp/0415228328/"&gt;The Perception of the Environment: Essays on Livelihood, Dwelling and Skill&lt;/a&gt;&lt;/span&gt; (Routledge, 2000).&lt;/li&gt;&lt;li&gt;Marlow, Frank M. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Machine-Shop-Essentials-Questions-Answers/dp/0975996304/"&gt;Machine Shop Essentials&lt;/a&gt;&lt;/span&gt; (Metal Arts, 2004).&lt;/li&gt;&lt;li&gt;Martin, Joe. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Tabletop-Machining-Joe-Martin/dp/0966543300/"&gt;Tabletop Machining&lt;/a&gt;&lt;/span&gt; (Sherline, 1998).&lt;/li&gt;&lt;li&gt;McDonough, William and Michael Braungart. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Cradle-Remaking-Way-Make-Things/dp/0865475873/"&gt;Cradle to Cradle: Remaking the Way We Make Things&lt;/a&gt;&lt;/span&gt; (North Point, 2002).&lt;/li&gt;&lt;li&gt;Molotch, Harvey. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Where-Stuff-Comes-Toasters-Computers/dp/0415950422/"&gt;Where Stuff Comes From: How Toasters, Toilets, Cars, Computers and Many Other Things Come to Be As They Are&lt;/a&gt;&lt;/span&gt; (Routledge, 2005).&lt;/li&gt;&lt;li&gt;Mims, Forrest M., III. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Electronic-Sensor-Circuits-Projects-Forrest/dp/0945053312/"&gt;Electronic Sensor Circuits and Projects&lt;/a&gt;&lt;/span&gt; (Master Publishing, 2004).&lt;/li&gt;&lt;li&gt;Mims, Forrest M., III. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Science-Communication-Circuits-Projects-Forrest/dp/0945053320/"&gt;Science and Communication Circuits and Projects&lt;/a&gt;&lt;/span&gt; (Master Publishing, 2004).&lt;/li&gt;&lt;li&gt;Napier, John. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Hands-John-Napier/dp/0691025479/"&gt;Hands&lt;/a&gt;&lt;/span&gt; (Princeton, 1993).&lt;/li&gt;&lt;li&gt;Oberg, Erik, et al. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Machinerys-Handbook-Toolbox-Oberg/dp/0831128003/"&gt;Machinery's Handbook&lt;/a&gt;&lt;/span&gt;, 28th ed. (Industrial Press, 2008).&lt;/li&gt;&lt;li&gt;O'Sullivan, Dan and Tom Igoe. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Physical-Computing-Sensing-Controlling-Computers/dp/159200346X/"&gt;Physical Computing: Sensing and Controlling the Physical World with Computers&lt;/a&gt;&lt;/span&gt; (Thomson, 2004).&lt;/li&gt;&lt;li&gt;Polanyi, Michael. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Personal-Knowledge-Towards-Post-Critical-Philosophy/dp/0226672883/"&gt;Personal Knowledge&lt;/a&gt;&lt;/span&gt; (Chicago, 1974).&lt;/li&gt;&lt;li&gt;Powell, John. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Survival-Fitter-John-Powell/dp/1853393169/"&gt;The Survival of the Fitter&lt;/a&gt;&lt;/span&gt; (Practical Action, 1995).&lt;/li&gt;&lt;li&gt;Pye, David. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Nature-Art-Workmanship-David-Pye/dp/0713689315/"&gt;The Nature and Art of Workmanship&lt;/a&gt;&lt;/span&gt; (A&amp;amp;C Black, 2008).&lt;/li&gt;&lt;li&gt;Rathje, William and Cullen Murphy. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Rubbish-Archaeology-Garbage-William-Rathje/dp/0816521433/"&gt;Rubbish! The Archaeology of Garbage&lt;/a&gt;&lt;/span&gt; (University of Arizona, 2001).&lt;/li&gt;&lt;li&gt;Schon, Donald A. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Reflective-Practitioner-Professionals-Think-Action/dp/0465068782/"&gt;The Reflective Practitioner: How Professionals Think in Action&lt;/a&gt;&lt;/span&gt; (Basic, 1984).&lt;/li&gt;&lt;li&gt;Sennett, Richard. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Craftsman-Richard-Sennett/dp/0300119097/"&gt;The Craftsman&lt;/a&gt;&lt;/span&gt; (Yale, 2008).&lt;/li&gt;&lt;li&gt;Slade, Giles. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Made-Break-Technology-Obsolescence-America/dp/0674025725/"&gt;Made to Break: Technology and Obsolescence in America&lt;/a&gt;&lt;/span&gt; (Harvard, 2007).&lt;/li&gt;&lt;li&gt;Sterling, Bruce. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Shaping-Things-Mediaworks-Pamphlets-Sterling/dp/0262693267/"&gt;Shaping Things&lt;/a&gt;&lt;/span&gt; (MIT, 2005).&lt;/li&gt;&lt;li&gt;Suchman, Lucy. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Human-Machine-Reconfigurations-Cognitive-Computational-Perspectives/dp/052167588X/"&gt;Human-Machine Reconfigurations: Plans and Situated Action&lt;/a&gt;&lt;/span&gt; (Cambridge, 2006).&lt;/li&gt;&lt;li&gt;Thackara, John. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Bubble-Designing-Complex-World/dp/0262701154/"&gt;In the Bubble: Designing in a Complex World&lt;/a&gt;&lt;/span&gt; (MIT, 2006).&lt;/li&gt;&lt;li&gt;Thompson, Rob. &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Manufacturing-Processes-Design-Professionals-Thompson/dp/0500513759"&gt;Manufacturing Processes for Design Professionals&lt;/a&gt;&lt;/span&gt; (Thames &amp;amp; Hudson, 2007).&lt;/li&gt;&lt;li&gt;Woodbury, Robert S. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Studies-History-Machine-Robert-Woodbury/dp/0262730332/"&gt;Studies in the History of Machine Tools&lt;/a&gt;&lt;/span&gt; (MIT, 1973).&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/bricolage" rel="tag"&gt;bricolage&lt;/a&gt; | &lt;a href="http://technorati.com/tag/critical+technical+practice" rel="tag"&gt;critical technical practice&lt;/a&gt; | &lt;a href="http://technorati.com/tag/diy" rel="tag"&gt;DIY&lt;/a&gt; | &lt;a href="http://technorati.com/tag/fabrication" rel="tag"&gt;fabrication&lt;/a&gt; | &lt;a href="http://technorati.com/tag/humanism" rel="tag"&gt;humanism&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5400156774772230187?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5400156774772230187'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5400156774772230187'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/12/some-winter-reading-for-humanist-makers.html' title='Some Winter Reading for Humanist Makers'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5019291402332799556</id><published>2008-11-21T16:24:00.014-05:00</published><updated>2008-11-21T18:22:51.803-05:00</updated><title type='text'>A Few Arguments for Humanistic Fabrication</title><content type='html'>By hooking a computer up to a machine that can add, remove, cut or fuse material, it is possible to turn a digital representation into a physical object.  Most historians (at least ones reading this blog) are probably familiar with the idea of digitization; think of this as 'materialization', a reversal of the process.  The humble printer is a kind of materializer for two-dimensional text and images.  These other machines (often referred to as rapid prototyping or computer-aided manufacturing machines, or even 'replicators') allow their users to make manifest three-dimensional objects of plastic, wood, metal, or fancier composites.&lt;br /&gt;&lt;br /&gt;Over the past few years, the price of rapid fabrication has been dropping, well, rapidly.  A lab that once cost hundreds of thousands or millions of dollars can now be had for less than $20,000.  Enthusiasts predict that the age of desktop fabrication is nigh; in the next few years we will all have devices on our desks that can print out 3D objects.  (Neil Gershenfeld's &lt;a href="http://www.amazon.com/Fab-Revolution-Desktop-Computers-Fabrication/dp/0465027466/"&gt;&lt;span style="font-style:italic;"&gt;Fab&lt;/span&gt;&lt;/a&gt; is a good introduction to some of the possibilities.)  
Small groups of DIY makers and hardware hackers are busy in their garages and attics trying to create &lt;a href="http://reprap.org/bin/view/Main/WebHome"&gt;a printer that can print a copy of itself&lt;/a&gt;, a machine that can &lt;a href="http://fabathome.org/wiki/index.php?title=Main_Page"&gt;print out a flashlight&lt;/a&gt;, one that can &lt;a href="http://candyfab.org/"&gt;print a toroidal coil of candy&lt;/a&gt;, or &lt;a href="http://www.evilmadscientist.com/article.php/cnctoast"&gt;burn a message into your morning toast&lt;/a&gt;.  The popular appeal of all this activity is clear in the pages of &lt;span style="font-style:italic;"&gt;&lt;a href="http://makezine.com/"&gt;MAKE&lt;/a&gt;&lt;/span&gt; magazine, or in the Discovery Channel's new show, "&lt;a href="http://dsc.discovery.com/tv/prototype-this/prototype-this.html"&gt;Prototype This&lt;/a&gt;".&lt;br /&gt;&lt;br /&gt;There are a number of reasons why historians and other humanists should be getting involved in desktop fabrication right now.  Here are a few.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;&lt;span style="font-weight:bold;"&gt;We can't predict the future&lt;/span&gt;&lt;/span&gt;.  In the 1960s, for example, it wasn't clear to everyone that there would ever be much reason for individuals to have the undivided attention of a single computer (never mind the dozens that we each now monopolize without thinking about it).  In retrospect, the people who struggled to get individual access to computers, who bought them from mail-order catalogs and built them at home, who taught themselves how to program even when that meant reading thick manuals and punching cards... well, now we know how that turned out.  
Using a computer-controlled soldering iron to fuse grains of sugar into candy sculptures may seem a bit tangential to the serious business of academia, but it's really too soon to judge.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;&lt;span style="font-style:italic;"&gt;Mind and hand&lt;/span&gt;&lt;/span&gt;.  Just because the separation between thinking and making is longstanding and well-entrenched doesn't make it a good idea.  At various times in the past, humanists have been deeply involved in making stuff: Archimedes, the Banu Musa brothers, da Vinci, Vaucanson, the Lunar Men, Bauhaus, W. Grey Walter, Gordon Mumma.  The list could easily be multiplied into every time and place, but the main point is that getting your hands dirty might be worthwhile, even if you're not da Vinci.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;&lt;span style="font-style:italic;"&gt;Historic experimentation&lt;/span&gt;&lt;/span&gt;.  People who work with material culture, the history of technology or experimental archaeology know that you can learn a lot about the past by handling physical stuff.  Until recently, that usually meant that you needed to have direct access to the stuff itself.  Now it is possible to fabricate physical models or artifacts that share properties with possibly rare or priceless originals.  Paleontologists and zooarchaeologists can learn from &lt;a href="http://visualizingevolution.blogspot.com/2008/07/paleontology-in-3d.html"&gt;3D printouts of bones and fossils&lt;/a&gt;.  Historians of science can more readily &lt;a href="http://stuff.mit.edu/afs/athena/course/other/sts.023/www/syllabus_01.pdf"&gt;replicate past experiments&lt;/a&gt;.  And so on.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;&lt;span style="font-weight:bold;"&gt;Tangible / haptic history&lt;/span&gt;&lt;/span&gt;.  
More generally, it will become possible to materialize shapes, surfaces, textures and artifacts that resemble those of the past, and that can be touched, felt, handled, and manipulated.  It is easy to imagine a new tangible or haptic history that follows and extends the sensory histories that are being written right now. &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;&lt;span style="font-style:italic;"&gt;Critical technical practice&lt;/span&gt;&lt;/span&gt;.  In the late 1990s, Philip Agre argued for a mode of research that involved both "the craft work of design and ... the reflexive work of critique."  The benefits of this approach are already apparent in the digital humanities, where historians, anthropologists, archaeologists, artists, literary and media scholars, and their colleagues are busy both creating and critiquing digital sources.  Why not extend this practice to rapid fabrication, microelectronics, new materials, robotics or nanotechnology?&lt;br /&gt;&lt;br /&gt;Some of the barriers are easily overcome.  When someone asks me why a historian would need an 8-axis CNC milling machine or an oscilloscope, I say, "Why not?"  The limitations of our physical spaces can be more difficult to circumvent.  Most of the teaching and research environments available to humanists at my university are designed to support solitary or small-group office work.  These spaces are almost comically unsuitable for the kinds of things I try to do with my students: soldering, moldmaking and casting, building and lighting physical exhibits, programming in groups, creating displays or signage.  Although I could afford to purchase a laser cutter, I can't vent the poisonous fumes from my workspace.  Cutting wood with power tools will set off the fire alarm.  I certainly couldn't set up a little foundry to explore the bootstrapping process that led from metal casting to machine tools.  
There isn't even anywhere to lock up student project prototypes so they won't be stolen or vandalized.  When I have a chance to talk to planners or people purchasing furniture or whatever, I ask them to imagine spaces that are appropriate for an art class or a shop class: high ceiling, natural light, plenty of ventilation, cement flooring, workbenches on casters, locking cabinets, big blank walls that you can hang things on.  No carpeting, no beige cubicles, no coffee tables with plants.  Humanists won't be able to think of themselves as makers until we create spaces for them to make things in.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/bricolage" rel="tag"&gt;bricolage&lt;/a&gt; | &lt;a href="http://technorati.com/tag/diy" rel="tag"&gt;DIY&lt;/a&gt; | &lt;a href="http://technorati.com/tag/fabrication" rel="tag"&gt;fabrication&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/physical+computing" rel="tag"&gt;physical computing&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5019291402332799556?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5019291402332799556'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5019291402332799556'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/11/few-arguments-for-humanistic.html' title='A Few Arguments for Humanistic Fabrication'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-2114880025266492079</id><published>2008-11-08T11:41:00.002-05:00</published><updated>2008-11-08T11:47:24.682-05:00</updated><title type='text'>Hemlines and History Appliances</title><content type='html'>[Crossposted to Cliopatria &amp;amp; Digital History Hacks]&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://www.vacuumwoman.com/MediaWorks/Stock/stock.html"&gt;Stock Market Skirt&lt;/a&gt; is a robot of sorts.  Created a number of years ago by Toronto-based media artist &lt;a href="http://www.vacuumwoman.com/"&gt;Nancy Paterson&lt;/a&gt;, it consists of a party dress on a dressmaker's mannequin and a number of monitors displaying stock tickers.  As prices fluctuate, "these values are sent to a program which determines whether to raise or lower the hemline via a stepper motor and a system of cables, weights and pulleys attached to the underside of the skirt. When the stock price rises, the hemline is raised; when the stock price falls, the hemline is lowered."  I can only assume that the edge of the dress is rumpled up on the floor these days, and that the motors are somewhat the worse for wear.&lt;br /&gt;&lt;br /&gt;The exhibit, of course, is a playful reinterpretation of George Taylor's hemline index.  In the 1920s, Taylor, an economist at the Wharton School, observed that skirt lengths were correlated with the state of the economy.  
Since then, the observation has continued to be relatively robust, and these days has been extended into &lt;a href="http://iht.nytimes.com/articles/2008/10/19/news/19lewin.php"&gt;many other domains&lt;/a&gt;, like music and movie preferences, the water content in foods, and even the shapes of Playboy playmates.&lt;br /&gt;&lt;br /&gt;I think the stock market skirt is a great example of what I call a "history appliance."  The idea is supposed to be whimsical: what if a device could dispense historical consciousness the way a tap dispenses water?  I've found that academic historians have a much harder time entertaining this question than public historians do.  After all, the latter have a long tradition of trying to build events, exhibits and situations that communicate interpretations of the past in ways that supplement the written word.  A diorama, for example, represents the past faithfully along some dimensions, but not all.  You can do scientific tests on an artifact--if it isn't a fake, its material substance can be informative about past events.  (Ditto if it is a fake.)  You can't necessarily do scientific tests on a diorama, and yet it is possible for it to communicate information about the past veridically.&lt;br /&gt;&lt;br /&gt;For a historian, the correlation between stock prices and hemlines raises questions of agency, and we feel comfortable exploring those on paper.  
Nothing foregrounds agency like a robot, however, and historians shouldn't shy away from building them into their historical interpretations.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/thing+knowledge" rel="tag"&gt;thing knowledge&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-2114880025266492079?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2114880025266492079'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2114880025266492079'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/11/hemlines-and-history-appliances.html' title='Hemlines and History Appliances'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4981107066836707153</id><published>2008-11-02T11:35:00.016-05:00</published><updated>2008-11-02T12:34:50.854-05:00</updated><title type='text'>The Bridge Goes Both Ways</title><content type='html'>This week I found myself in a somewhat unfamiliar situation.  
Along with &lt;a href="http://www.history.vt.edu/faculty/shifflett.htm"&gt;Randy Shifflett&lt;/a&gt; and &lt;a href="http://www.scu.edu/cas/history/"&gt;Fabio Lopez-Lazaro&lt;/a&gt;, I was asked to represent the discipline of history at a community building meeting of the &lt;a href="http://www.likes.org.vt.edu/"&gt;LIKES&lt;/a&gt; (Living in the KnowlEdge Society) project at Virginia Tech.  There, surrounded by computer scientists, engineers and other 'hard' scientists, we had to explain some of the challenges that face people who wish to integrate computation into historical research and teaching.  In many ways, it was a return to fundamentals.  We explained that many facts about the past are readily quantified, but that doing so often misses the point.  Historical examples raised by our non-historian colleagues often focused on names and dates, and we had to tell them that the really interesting action is usually elsewhere. We reviewed ideas of contingency, counterfactual reasoning, and ambiguity.  We explained why it usually doesn't make sense to project anachronistic categories and ideas onto past situations.  We discussed the holism and methodological individualism of most researchers in our field. &lt;br /&gt;&lt;br /&gt;When asked what kind of computational tools historians and other humanists need, the best metaphor that I could come up with drew on Jim Clifford's ideas of &lt;a href="http://www.amazon.com/Routes-Travel-Translation-Twentieth-Century/dp/0674779614/"&gt;travel and translation&lt;/a&gt;.  It would be easy to make tools that quantified how many miles you traveled on your vacation, how many feet you were standing from the sculpture when you took the picture, how you rated your meal in Venice on a scale of 1 to 10 ... but it would completely miss the point.  
Instead you want ways to help you translate, to capture and document your experiences, to cue your memories, to support your storytelling, to deepen your interpretations and understanding.&lt;br /&gt;&lt;br /&gt;In this blog, I've assumed that most of my audience would be historians and other humanists who are interested in exploring digital and computational techniques at a number of levels.  The LIKES meeting reminded me that the bridge goes both ways, that computer scientists, applied mathematicians, science educators and others are also interested in ways that their skills and tools might be applied in new domains.  So, for those of you coming in the other direction: welcome!  Here are a few things you might be interested to know:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Historians who are interested in quantification already know about and use spreadsheets, databases, mathematical models, computer programs and visualization.  Historians who aren't interested in quantification won't be happy with a definition of 'qualitative' that consists of "leaving the numerical scale off of the axes of your graph."&lt;/li&gt;&lt;li&gt;The best way to promote computation amongst humanists is to emphasize social and textual applications of computing, especially ones that augment the power of individuals to do research that draws on collections of cultural / heritage materials that are distributed across many different repositories.&lt;/li&gt;&lt;li&gt;Verbs, not nouns.  John Unsworth's paper on &lt;a href="http://www3.isrl.uiuc.edu/%7Eunsworth/Kings.5-00/primitives.html"&gt;scholarly primitives&lt;/a&gt; is a good place to start.&lt;/li&gt;&lt;li&gt;There are a number of good books that take a humanistic perspective while still being sensitive to the potential of  instrumental thinking.  
I particularly like Philip Agre's &lt;a href="http://www.amazon.com/Computation-Human-Experience-Learning-Doing/dp/0521386039/"&gt;&lt;span style="font-style: italic;"&gt;Computation and Human Experience&lt;/span&gt;&lt;/a&gt; and Lucy Suchman's &lt;a href="http://www.amazon.com/Human-Machine-Reconfigurations-Cognitive-Computational-Perspectives/dp/052167588X/"&gt;&lt;span style="font-style: italic;"&gt;Human-Machine Reconfigurations&lt;/span&gt;&lt;/a&gt;.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history"&gt;digital history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4981107066836707153?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4981107066836707153'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4981107066836707153'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/11/bridge-goes-both-ways.html' title='The Bridge Goes Both Ways'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7233897191277113364</id><published>2008-10-06T10:57:00.016-04:00</published><updated>2008-10-06T12:17:06.249-04:00</updated><title type='text'>The One True Language</title><content type='html'>In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Seven-Nights-Jorge-Luis-Borges/dp/0811209059/"&gt;Seven Nights&lt;/a&gt;&lt;/span&gt;, Borges has an essay where he describes the process by which he first read the &lt;span style="font-style:italic;"&gt;Divine Comedy&lt;/span&gt;: &lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;They were very handy books, published by Dent.  They fit into my pocket.  On the left was the Italian text, and on the right a literal translation.  I devised this modus operandi: I first read a verse, a tercet, in the English prose; then I read the verse in Italian; and so on through to the end of the canto.  Then I read the whole canto in English, and finally in Italian.  With that first reading I realized that the translations were no substitute for the original text.  The translation could be, at best, a means and a stimulus for the reader to approach the original. ... Poetry is, among so many other things, an intonation, an accentuation that is often untranslatable.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;I was recently reminded of this because I decided that my digital history grad class should use the &lt;a href="http://processing.org/"&gt;Processing&lt;/a&gt; programming language for their group project.  
Since I hadn't programmed in the language before, I bought a couple of textbooks and sat down to read them, slowing when I needed to mentally translate unfamiliar commands into more familiar idioms.&lt;br /&gt;&lt;br /&gt;Beginning programmers often worry about which language to learn first.  Which one is the most powerful?  The most useful?  The easiest to learn?  Which one will help me to get a high-paying job?  A semester or a year seems like a long time to invest in something when it might turn out to be the &lt;span style="font-style:italic;"&gt;wrong choice&lt;/span&gt;.  At a theoretical level, programming languages are &lt;a href="http://en.wikipedia.org/wiki/Turing_complete"&gt;deeply equivalent&lt;/a&gt;, but in practice every programming language makes some things easy and some things hard.  Or, in the slogan of one language, it "makes easy things easy and hard things possible."  These language differences become the stuff of &lt;a href="http://www.americanscientist.org/issues/pub/the-semicolon-wars/"&gt;holy wars&lt;/a&gt;, but they shouldn't.  The best language for the job depends largely on the job.&lt;br /&gt;&lt;br /&gt;Processing, for instance, has built-in commands that make it easy to map numbers from one range of values to another.  Now this isn't something that is too difficult to program from more primitive commands; it comes up frequently enough that you learn how to do it in whatever language you're using.  But when I read the description of the Processing commands, I realized that I have implemented similar functions in almost every language that I've ever programmed in.  
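&lt;br /&gt;&lt;br /&gt;As a rough sketch of the kind of function I mean, here is linear range mapping written out by hand. It is in Python rather than Processing, and the function name, its argument order and the sensor range in the example are illustrative assumptions of mine, not anything taken from the Processing documentation (where the built-in is called map()).

```python
def map_range(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly rescale value from [in_lo, in_hi] to [out_lo, out_hi]."""
    if in_hi == in_lo:
        raise ValueError("input range must not be empty")
    # Fraction of the way that value sits along the input range.
    fraction = (value - in_lo) / (in_hi - in_lo)
    # Place that same fraction along the output range.
    return out_lo + fraction * (out_hi - out_lo)


# Hypothetical example: a sensor that reports 0 to 1023, rescaled to 0 to 255.
brightness = map_range(512, 0, 1023, 0, 255)
```

Values outside the input range are extrapolated rather than clamped, so a caller who needs hard limits has to add them separately.&lt;br /&gt;&lt;br /&gt;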
By choosing to make this a language primitive, the designers of Processing made it easier for beginners to do a number of different tasks, including scaling the ranges of values returned by different analog sensors (which is something my students will need to do).&lt;br /&gt;&lt;br /&gt;There's no one true language for programming any more than there is one true language for humanism, or one true wood for carpenters.  As Borges says, the intonations of poetry are often untranslatable, and it's true for code, too.  In a sense, you don't really know how to program until you're familiar with more than one language, because the essence of programming consists in knowing how to translate the idioms of one language into a more or less familiar one.  And this is something that humanists have long known: if there is a oneness and truth to language, it is to be found in the multiple practices of translation.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/programming" rel="tag"&gt;programming&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7233897191277113364?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7233897191277113364'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7233897191277113364'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/10/one-true-language.html' title='The One True Language'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5890460719540260347</id><published>2008-10-03T18:45:00.006-04:00</published><updated>2008-10-03T19:40:59.018-04:00</updated><title type='text'>Navigating Digital History</title><content type='html'>This year, one of the first slides that I put up for the new Science, Technology and Global History class that Rob MacDougall and I are teaching was a quote from Patrick Manning's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Navigating-World-History-Historians-Create/dp/1403961190/"&gt;Navigating World History&lt;/a&gt;&lt;/span&gt;:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Navigating world history is an ambitious but limited goal, one quite distinct from the unattainable aim of "mastering" the topic. No one can learn all of world history. Anyone who pursues such a goal is sure to become lost. To strike an analogy, all those who have attempted to conquer the world have failed, but many of those who have traveled the globe have gained pleasure and expanded their understanding. (x)&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;I originally intended this forewarning as a way of managing expectations.  I figured the students wouldn't be so disappointed in me when they found out that one consequence of taking on the history of everything from the Big Bang to human extinction is that the sum of the prof's knowledge asymptotically approaches zero.  
The students, however, seem to be taking my relative ignorance in stride, and the quote has mostly served to console me when I have to leave some stuff out of my lectures.&lt;br /&gt;&lt;br /&gt;I was reminded of navigation the other day when I met with a PhD student who is close to finishing her doctorate and thinking about her second project.  She wants to do something with digital sources but is having a hard time getting her bearings.  Our conversation made me realize that I didn't have a single-page "getting started" guide for people who have never seriously worked with online sources.  So here it is.&lt;br /&gt;&lt;br /&gt;1. &lt;span style="font-style:italic;"&gt;You won't be able to read everything&lt;/span&gt;.  In fact, new material on your topic will appear online faster than you can read it.  The longer you work on a topic, the more behind you will get.  It's OK, because everyone faces this problem whether they realize it or not.&lt;br /&gt;&lt;br /&gt;2. &lt;span style="font-style:italic;"&gt;The first tool you should master is the search engine&lt;/span&gt;.  Most people think that typing a word or two into the Google or Yahoo! search box is all that you need to know.  Not so!  First of all, search engines have an advanced search page that lets you focus in on your topic, exclude search terms, weight some terms more than others, limit your results to particular kinds of document, to particular sites, to date ranges, and so on.  Second, different search engines introduce different kinds of bias by ranking results differently.  You get a better view when you routinely use more than one.&lt;br /&gt;&lt;br /&gt;3. &lt;span style="font-style:italic;"&gt;You should have a strategy for information trapping&lt;/span&gt;.  An explicit search is something that you do once, but the web is constantly changing.  By using RSS feeds it is possible to set up a number of searches that run automatically and provide you with a constantly updated view of your subject.  
You can learn more about the technique in Tara Calishain's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Information-Trapping-Real-Time-Research-Web/dp/0321491718/"&gt;Information Trapping&lt;/a&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;4. &lt;span style="font-style:italic;"&gt;You can organize citations right in your browser&lt;/span&gt;.  Until you start doing advanced work in digital history, you will access almost all of your online sources through your web browser.  If you use &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt;, you can keep track of those sources in your browser, too.  It really speeds up the research process.&lt;br /&gt;&lt;br /&gt;5. &lt;span style="font-style:italic;"&gt;It is possible to automate the process of downloading sources&lt;/span&gt;.  There are a number of tools that make it easy to grab large batches of online sources without having to download them one at a time.  In the Firefox browser, for example, you can use something like &lt;a href="http://www.downthemall.net/"&gt;DownThemAll&lt;/a&gt;.  Another option is GNU &lt;a href="http://www.gnu.org/software/wget/"&gt;Wget&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;6. &lt;span style="font-style:italic;"&gt;The web is not structured like a ball of spaghetti&lt;/span&gt;.  A lot of the most interesting information to be gleaned from digital sources lies in the hyperlinks leading into and out of various nodes, whether personal pages, documents, archives, institutions, or what have you.  Search engines provide some rudimentary tools for mapping these connections, but much more can be learned with more specialized tools.&lt;br /&gt;&lt;br /&gt;7. 
&lt;span style="font-style:italic;"&gt;Assume that what you want to know is out there, and go looking for it&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+education" rel="tag"&gt;history education&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5890460719540260347?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5890460719540260347'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5890460719540260347'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/10/navigating-digital-history.html' title='Navigating Digital History'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6890675706863669255</id><published>2008-09-21T10:33:00.008-04:00</published><updated>2008-09-21T11:42:09.521-04:00</updated><title type='text'>Hello World!</title><content type='html'>It's traditional when learning a new programming language to have your first program simply say "Hello world!" and terminate.  
It not only boosts your confidence, it signals that you've got all of the basics in place: an editor to create programs, an interpreter or compiler that can follow the instructions that you've programmed, and a way for the information to get out of the program and the computer into a form where you can make use of it.  What you do at that point is up to you... hopefully more programming.&lt;br /&gt;&lt;br /&gt;Over the last few years, I've been wrapping up my &lt;a href="http://www.amazon.com/Archive-Place-Unearthing-Chilcotin-Plateau/dp/0774813776/"&gt;first book project&lt;/a&gt;: a study of how people reconstruct the past from various kinds of physical traces.  My interest in the ways that material evidence and places can inform historical consciousness, and a growing interest in the potential of digital and public history, have led me to a related set of research questions.  How can we use new technologies like ubiquitous / pervasive computing, ambient and tangible interfaces, and desktop fabrication to build historical interpretations into physical devices and environments?  What happens when all of the bits that we've been creating through various kinds of digitization can become material atoms again?  And how can this help us to better understand various pasts and make them usable in the present?&lt;br /&gt;&lt;br /&gt;For a couple of years I've been doing projects with my students and research assistants that use technology to augment everyday places and objects, to put historical interpretations back into &lt;span style="font-style:italic;"&gt;stuff&lt;/span&gt;.  These projects have made use of &lt;a href="http://digitalhistory.wikispot.org/Place-based_Computing"&gt;GPS-enabled handheld and tablet computers&lt;/a&gt;; &lt;a href="http://digitalhistory.wikispot.org/Interactive_Ambient_and_Tangible_Devices_for_Knowledge_Mobilization"&gt;microcontrollers, analog sensors and actuators&lt;/a&gt;; and other electronic technologies.  
Up until now, however, we've had to buy physical components or fashion them by hand.&lt;br /&gt;&lt;br /&gt;Last week, &lt;a href="http://adamcrymble.blogspot.com/"&gt;Adam&lt;/a&gt;, &lt;a href="http://devonelliott.blogspot.com/"&gt;Devon&lt;/a&gt; and I had a chance to set up our new &lt;a href="http://www.rolanddg.com/product/3d/3d/mdx-20_15/mdx-20_15.html"&gt;Roland Modela MDX-20&lt;/a&gt; and try making something with it, a kind of physical "hello world."&lt;br /&gt;&lt;br /&gt;&lt;a href="http://digitalhistory.wikispot.org/FabWiki_Using_MDX-20"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SNZcQ5pzt6I/AAAAAAAAAA8/cjJluVTnczg/s400/mdx20-20080919-thumbs.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5248483861170730914" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The MDX-20 is a (relatively simple) computer-controlled milling machine.  It is able to move a rapidly spinning, sharp tool in three dimensions, gradually removing material from a solid block, so that it comes to precisely resemble a three-dimensional model in the computer.  What this means is that something that is almost purely virtual can be materialized in foam, plastic, wood, and other soft physical media.  (Although our first efforts look pretty chunky, the machine is capable of much more precise contours--we have a lot to learn.)  The MDX-20 also has a scanning probe, which we haven't had a chance to test yet.  When used in scanning mode, the MDX-20 automates the creation of 3D models from physical objects.  This allows you to start with one or more objects in the real world, scan them to create 3D models, edit or remix as desired, then replicate them in material form.  
At this point, the possibilities seem nearly endless.&lt;br /&gt;&lt;br /&gt;In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Thing-Knowledge-Philosophy-Scientific-Instruments/dp/0520232496/"&gt;Thing Knowledge&lt;/a&gt;&lt;/span&gt;, the philosopher Davis Baird argues that "Things and theory can both constitute our knowledge of the world."  Things can serve as &lt;span style="font-style:italic;"&gt;models&lt;/span&gt;, physical representations that act in a similar way to theories.  They can &lt;span style="font-style:italic;"&gt;create phenomena&lt;/span&gt;, separating action "from human agency and buil[ding it] into the reliable behavior of an artifact."  Or they can serve as &lt;span style="font-style:italic;"&gt;measuring instruments&lt;/span&gt;, combining both representation and work (11-12).  There's a long tradition of ignoring things to focus on ideas, cyberspace being one of many guises for idealism.  It's time for digital humanists to say, "Hello, world!"&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digitization" rel="tag"&gt;digitization&lt;/a&gt; | &lt;a href="http://technorati.com/tag/fabrication" rel="tag"&gt;fabrication&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/stepwise+refinement" rel="tag"&gt;stepwise refinement&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-6890675706863669255?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6890675706863669255'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6890675706863669255'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/09/hello-world.html' 
title='Hello World!'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SNZcQ5pzt6I/AAAAAAAAAA8/cjJluVTnczg/s72-c/mdx20-20080919-thumbs.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-9219154307613498116</id><published>2008-09-06T18:20:00.008-04:00</published><updated>2008-09-06T19:08:08.348-04:00</updated><title type='text'>Practices, Not Products</title><content type='html'>The first week of school is a good time to expect to see Murphy's law in action.  This year my server suddenly decided to start falling over at diminishing intervals.  A couple of weeks of sporadic debugging have left me where I started: with an unreliable server.  All of the code and images for this blog are hosted there, so they are temporarily unavailable, and I had to scramble a bit to find a new online home for the group project that my students will be doing in their grad class in digital history.  This summer, too, the keepers of my old standby the online &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.biographi.ca/index-e.html"&gt;Dictionary of Canadian Biography&lt;/a&gt;&lt;/span&gt; suddenly decided to overhaul their website.  
While I applaud the fact that they moved from Active Server Pages to PHP, I'm not so happy that so many of the code examples in &lt;span style="font-style:italic;"&gt;Digital History Hacks&lt;/span&gt; and the &lt;span style="font-style:italic;"&gt;&lt;a href="http://niche.uwo.ca/programming-historian/"&gt;Programming Historian&lt;/a&gt;&lt;/span&gt; have to be revised.&lt;br /&gt;&lt;br /&gt;I've solved my server problem for the time being, or more accurately, sidestepped it, by moving a lot of my online stuff to a new home: &lt;a href="http://digitalhistory.wikispot.org/"&gt;digitalhistory.wikispot.org&lt;/a&gt;.  In the process I was reminded again that wikis really are the fastest and most awesome way to get your stuff online in a form that is durable but plastic enough to be continually reshaped.  I can thank &lt;a href="http://raymondyee.net/wiki/"&gt;Raymond Yee&lt;/a&gt; for the inspiration.  Although I've used a number of online tools, it didn't occur to me that a wiki can replace most of them until I saw Raymond give a talk at &lt;a href="http://thatcamp.org/"&gt;THATCamp&lt;/a&gt;.  Rather than bust out an Open Office presentation or something like that, Raymond pointed his browser to his own wiki, a "working space / public knowledge repository".  He had already entered some of the material that he wanted to talk about, and as he gave his presentation he continued to edit.  When his presentation was over, he clicked 'save' and everything was already available online.&lt;br /&gt;&lt;br /&gt;The beauty of a wiki, as many people have noted, is that it allows online material to grow quickly and organically.  Rather than try to build my new online presence in one pass, I was able to sketch the outlines of what I wanted to add.  Now, every time I look at the site, I see a whole bunch of work that still needs to be done.  I can chip away at it, rethink, reorganize, and everything remains available to other people.  
On some of the pages I've roughed out sections for my students or research assistants to fill in; I expect them to chip away, rethink and reorganize, too.  In effect, wiki software can provide scaffolding for practices.  There's no real final product, just the most recent edit.  (And, of course, access to the entire history of edits).&lt;br /&gt;&lt;br /&gt;This year, &lt;a href="http://www.robmacdougall.org/"&gt;Rob MacDougall&lt;/a&gt; and I are teaching a new course on science, technology and global history, and I find myself in the (exciting? unenviable?) position of writing my lectures the week before I give them.  A lot of my projects feel like they may be on hold until November, when I can hand the lecturing off to Rob and start to deal with some of the changes that have broken things that used to work.  I can't feel too bothered, however.  &lt;a href="http://digitalhistoryhacks.blogspot.com/2008/01/all-is-flux.html"&gt;All is flux&lt;/a&gt;, especially on the internet.  The trick is to find the techniques and tools that help you deal gracefully with change, to think in clay and not in stone.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/dictionary+of+canadian+biography" rel="tag"&gt;Dictionary of Canadian Biography&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/electronica" rel="tag"&gt;entropy&lt;/a&gt; | &lt;a href="http://technorati.com/tag/wikis" rel="tag"&gt;wikis&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-9219154307613498116?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9219154307613498116'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/19974449/posts/default/9219154307613498116'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/09/practices-not-products.html' title='Practices, Not Products'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5065221762570799272</id><published>2008-08-20T05:34:00.004-04:00</published><updated>2008-08-20T06:36:22.676-04:00</updated><title type='text'>Traces of Use</title><content type='html'>When he figured that April was the cruelest month, I think TS Eliot was off by four.  I find that the early summer stretches into an endless vista of exciting possibilities for new research and teaching.  I make far too many commitments, all of which come back to haunt me in late August.  Other than dropping in to do light maintenance, for example, I haven't had time recently to write much new material for the &lt;span style="font-style:italic;"&gt;&lt;a href="http://niche.uwo.ca/programming-historian/"&gt;Programming Historian&lt;/a&gt;&lt;/span&gt;.  The last time that I did, however, I noticed that visitor logs tell an interesting story.&lt;br /&gt;&lt;br /&gt;To date, the front page has received around 12 thousand hits, as people arrive at the site and decide what to do next.  At that point, most of them leave.  They may have ended up there by accident; they may bookmark the site to look at later.  The next two sections are prefatory.  The first (around 4 thousand visits) suggests why you may want to learn how to program.  The second (almost 5 thousand visits) tells you how to install the software that you need to get started.  
My interpretation is that about a fifth of our visitors are already convinced they want to learn how to program, which I think is a good sign.  The actual programming starts in the next section (2 thousand visits) and goes from there (while the number of visitors for subsequent sections slowly drops to about a thousand each).  These numbers could be interpreted in various ways, but to me they suggest that (1) historians and other humanists want to learn how to program, (2) good intentions only get you so far, and (3) if you do stick with it, it gets harder gradually.&lt;br /&gt;&lt;br /&gt;These are pretty crude metrics, although more informative ones than I'm getting from, say, the sales figures for my award-winning-but-otherwise-neglected monograph (&lt;a href="http://www.amazon.com/Archive-Place-Unearthing-Chilcotin-Plateau/dp/0774813776/"&gt;buy a copy today!&lt;/a&gt;).  My friends who work in psycholinguistics have much more sophisticated ways of determining how people read and understand text, with devices that track the subject's gaze and estimate the moment-by-moment contents of their short term memory.  I want people to get something out of the &lt;span style="font-style:italic;"&gt;Programming Historian&lt;/span&gt;, but I don't need that level of detail about what they're getting.&lt;br /&gt;&lt;br /&gt;In &lt;span style="font-style:italic;"&gt;The Social Life of Information&lt;/span&gt;, Brown and Duguid have an anecdote about a historian who goes through batches of eighteenth-century letters rapidly by sniffing bundles of them.  When asked what he is doing, he explains that letters written during a cholera outbreak were disinfected with vinegar.  "By sniffing for the faint traces of vinegar that survived 250 years and noting the date and source of the letters, he was able to chart the progress of cholera outbreaks."  Brown and Duguid go on to note that "Digitization could have distilled out the text of those letters. 
It would, though, have left behind that other interesting distillate, vinegar."&lt;br /&gt;&lt;br /&gt;Probably, but not necessarily.  Digitization simply refers to the explicit digital representation of something that can be measured.  We are content at the moment with devices that take pictures of documents, and those devices have been steadily improving.  We wouldn't be as content with the scanning quality of 2002, when &lt;span style="font-style:italic;"&gt;The Social Life of Information&lt;/span&gt; was published, and we'd, like, &lt;span style="font-style:italic;"&gt;totally hate&lt;/span&gt; the scanning quality of 1982 or 1962 ... just ask my students when they have to work with microfilm.  That said, high resolution infrared spectroscopy makes it possible to build chemical sniffers that outperform human noses.  They also make it possible to go through an archive and digitize the smells of every document.&lt;br /&gt;&lt;br /&gt;Saying that we can digitize any trace that we can discover and measure isn't the same thing as saying we can discover and measure any trace that we might need at the moment, episodes of &lt;span style="font-style:italic;"&gt;CSI&lt;/span&gt; notwithstanding. The  material world is almost infinitely informative about the past, but the traces that are preserved have nothing to do with our interests and intents.  
And one shouldn't draw too fine a line between the analog and the digital, because digital representations are always stored on real-world analog devices, something Matt Kirschenbaum explores in his new book &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Mechanisms-New-Media-Forensic-Imagination/dp/0262113112/"&gt;Mechanisms&lt;/a&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/analog" rel="tag"&gt;analog&lt;/a&gt; | &lt;a href="http://technorati.com/tag/clues" rel="tag"&gt;clues&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digitization" rel="tag"&gt;digitization&lt;/a&gt; | &lt;a href="http://technorati.com/tag/representation" rel="tag"&gt;representation&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5065221762570799272?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5065221762570799272'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5065221762570799272'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/08/traces-of-use.html' title='Traces of Use'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4112561886986197738</id><published>2008-08-07T10:15:00.012-04:00</published><updated>2008-08-07T11:16:28.365-04:00</updated><title type='text'>Arms Races</title><content type='html'>[Cross-posted to Cliopatria and Digital History Hacks]&lt;br /&gt;&lt;br /&gt;Like many people who blog at Blogger, I was recently notified by e-mail that my blog had been identified by their automated classifiers "as a potential spam blog."  In order to prove that this was not the case, I had to log in to one of their servers and request that my blog be reviewed by a human being.  The e-mail went on to say "Automatic spam detection is inherently fuzzy, and occasionally a blog like yours is flagged incorrectly. We sincerely apologize for this error."  The author of the e-mail knew, of course, that if my blog were sending spam then his or her e-mail would fall on deaf ears (as it were)... you don't have to worry about bots' feelings.  The politeness was intended for me, a hapless human caught in the crossfire in &lt;a href="http://www.amazon.com/War-Intelligent-Machines-Manuel-Landa/dp/0942299752/"&gt;the war of intelligent machines&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;That same week, a lot of my e-mails were also getting bounced.  Since I have my blog address in my .sig file, I'm guessing that may have something to do with it.  Alternately, my e-mail address may have been temporarily blocked as the result of a surge in spam being sent from GMail servers.  This to-and-fro, attack against counter-attack, &lt;a href="http://www.amazon.com/Spy-vs-Complete-Casebook/dp/0823050211/"&gt;&lt;span style="font-style:italic;"&gt;Spy vs. 
Spy&lt;/span&gt;&lt;/a&gt; kind of thing can be irritating for the collaterally damaged but it is good news for digital historians, as paradoxical as that may seem.&lt;br /&gt;&lt;br /&gt;One of the side effects of the war on spam has been a lot of sophisticated research on automated classifiers that use Bayesian or other techniques to categorize natural language documents.  Historians can use these algorithms to make their own online archival research much more productive, as I argued in a &lt;a href="http://digitalhistoryhacks.blogspot.com/search?q=bayesian"&gt;series of posts&lt;/a&gt; this summer.  &lt;br /&gt;&lt;br /&gt;In fact, a closely related arms race is being fought at another level, one that also has important implications for the digital humanities.  The optical character recognition (OCR) software that is used to digitize paper books and documents is also being used by spammers to try and circumvent software intended to block them.  This, in turn, is having a positive effect on the development of OCR algorithms, and leading to higher quality digital repositories as a collateral benefit.  Here's how.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Computer scientists create the &lt;a href="http://en.wikipedia.org/wiki/Captcha"&gt;CAPTCHA&lt;/a&gt;, a "Completely Automated Public Turing test to tell Computers and Humans Apart."  In essence, it shows a wonky image of a short text on the screen, and the (presumably human) user has to read it and type in the characters.  If they match, the system assumes a real person is interacting with it.&lt;/li&gt;&lt;li&gt;Google releases the &lt;a href="http://google-code-updates.blogspot.com/2006/08/announcing-tesseract-ocr.html"&gt;Tesseract OCR engine&lt;/a&gt; that they use for Google Books as open source.  On the plus side, a whole community of programmers can now improve Tesseract OCR.  
On the minus side, a whole community of spammers can put it to work cracking CAPTCHAs.&lt;/li&gt;&lt;li&gt;In the meantime, a group of computer scientists comes up with a brilliant idea, the &lt;a href="http://recaptcha.net/"&gt;reCAPTCHA&lt;/a&gt;.  Every day, tens of millions of people are reading wonky images of short character strings and retyping them.  Why not use all of these infinitesimal units of labor to do something useful?  The reCAPTCHA system uses OCR errors for its CAPTCHAs.  When you respond to a reCAPTCHA challenge, you're helping to improve the quality of digitized books.&lt;/li&gt;&lt;li&gt;The guys with &lt;a href="http://en.wikipedia.org/wiki/White_hat"&gt;white hats&lt;/a&gt; are also using OCR to crack CAPTCHAs, with the aim of creating stronger challenges. One side effect is that the OCR gets better at recognizing wonky text, and thus better for creating digital books.&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/optical+character+recognition" rel="tag"&gt;optical character recognition (OCR)&lt;/a&gt; | &lt;a href="http://technorati.com/tag/turing+test" rel="tag"&gt;Turing test&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4112561886986197738?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4112561886986197738'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4112561886986197738'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/08/arms-races.html' title='Arms Races'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5832964328479314020</id><published>2008-07-20T08:53:00.006-04:00</published><updated>2008-07-20T10:30:45.224-04:00</updated><title type='text'>Towards a Computational History</title><content type='html'>[Cross-posted to Cliopatria &amp;amp; Digital History Hacks]&lt;br /&gt;&lt;br /&gt;Given that relatively few of our colleagues are familiar with digital history yet--and that those of us who practice some form of it aren't sure what to call it: digital history? history and computing? digital humanities?--it may seem a bit perverse to start talking about computational history.  Nevertheless, it's an idea that we need, and the sooner we start talking and thinking about it, the better.&lt;br /&gt;&lt;br /&gt;From my perspective, digital history simply refers to the idea that many of our potential sources are now online and available on the internet.  It is possible, of course, to expand this definition and tease out many of its implications.  (For more on that, see the forthcoming interchange on "The Promise of Digital History" in the September 2008 issue of &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.journalofamericanhistory.org/"&gt;The Journal of American History&lt;/a&gt;&lt;/span&gt;).  To some extent we're all digital historians already, as it is quickly becoming impossible to imagine doing historical research without making use of e-mail, discussion lists, word processors, search engines, bibliographical databases and electronic publishing.  
Some day pretty soon, the "digital" in "digital history" is going to sound redundant, and we can drop it and get back to doing what we all love.&lt;br /&gt;&lt;br /&gt;Or maybe not.  By that time, I think, it will have become apparent that having networked access to an effectively infinite archive of digital sources, and to one another, has completely changed the nature of the game.  Here are a few examples of what's in store.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Collective intelligence&lt;/span&gt;.  Social software allows large numbers of people to interact efficiently and focus on solving problems that may be too difficult for any individual or small group.  Does this sound utopian?  Present-day examples are easy to find in massive online games, open source software, and even the much-maligned &lt;span style="font-style:italic;"&gt;Wikipedia&lt;/span&gt;.  These efforts all involve unthinkably complex assemblages of people, machines, computational processes and archives of representations.  We have no idea what these collective intelligences will be capable of.  Is it possible for an ad hoc, international, multi-lingual group of people to engage in a parallel and distributed process of historical research? Is it possible for a group to transcend the historical consciousness of the individuals that make it up? How does the historical reasoning of a collective intelligence differ from the historical reasoning of more familiar kinds of historian?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Machines as colleagues&lt;/span&gt;. Most of us are aware that law enforcement and security agencies routinely use biometric software to search through databases of images and video and identify people by facial characteristics, gait, and so on.  Nothing precludes the use of similar software with historical archives.  But here's the key point.  
Suppose you have a photograph of known provenance, depicting someone in whom you have an interest.  Your biometric software skims through a database of historical images and matches your person to someone in a photo of a crowd at an important event. If the program is 95% sure that the match is valid, are you justified in arguing that your person was in the crowd that day?&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Archives with APIs&lt;/span&gt;.  Take it a step further.  Most online archives today are designed to allow human users to find sources and read and cite them in traditional ways.  It is straightforward, however, for the creators of these archives to add an application programming interface (API), a way for computer programs to request and make use of archival sources.  You could train a machine learner to recognize pictures of people, artifacts or places and turn it loose on every historical photo archive with an API.  Trained learners can be shared amongst groups of colleagues, or subject as populations to a process of artificial selection.  At present, APIs are most familiar in the form of mashups, websites that integrate data from different sources on-the-fly.  The race is on now to provide APIs for some of the world's most important online archival collections.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style:italic;"&gt;Models&lt;/span&gt;.  Agent-based and other approaches from complex adaptive systems research are beginning to infiltrate the edges of the discipline, particularly amongst researchers more inclined toward the social sciences.  Serious games appeal to a generation of researchers that grew up with not-so-serious ones.  People who might once have found quantitative history appealing are now building geographic information systems.  In every case, computational processes become tools to think with.  
I was recently at the &lt;a href="http://metropolisontrial.wordpress.com/"&gt;Metropolis on Trial&lt;/a&gt; conference, loosely organized around the 120 million word online archive of the &lt;a href="http://www.oldbaileyonline.org/"&gt;Old Bailey proceedings&lt;/a&gt;.  At the conference, historians talked and argued about sources and interpretations, of course, but also about optical character recognition and statistical tables and graphs and search results generated with tools on the website.  We're not yet at a point where these discussions involve much nuanced analysis of layers of computational mediation... but it is definitely beginning.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/computational+history" rel="tag"&gt;computational history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5832964328479314020?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5832964328479314020'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5832964328479314020'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/07/towards-computational-history.html' title='Towards a Computational History'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5301736224174367288</id><published>2008-07-03T17:03:00.006-04:00</published><updated>2008-07-03T18:15:02.049-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 14</title><content type='html'>I'm off to England next week to present some of this work at the &lt;a href="http://metropolisontrial.wordpress.com/"&gt;Metropolis on Trial&lt;/a&gt; conference, so it is time to bring this series of posts to a close.  I'd like to wrap up by summarizing what we've accomplished and making a clearer case for machine learning as a tool for historical research.&lt;br /&gt;&lt;br /&gt;Papers in the machine learning literature often say something like "we tested learners &lt;span style="font-style: italic;"&gt;x&lt;/span&gt;, &lt;span style="font-style: italic;"&gt;y&lt;/span&gt;, and &lt;span style="font-style: italic;"&gt;z&lt;/span&gt; on this standard data set and found errors of 40%, 20% and 4% respectively.  Learner &lt;span style="font-style: italic;"&gt;z&lt;/span&gt; should therefore be used in this situation."  The value of such research isn't immediately apparent to the working historian.  For one thing, many of the most powerful machine learning algorithms require the learner to be given all of the training data at once.  Historians, on the other hand, tend to encounter sources piecemeal, sometimes only recognizing their significance in retrospect.  Training a machine learner usually requires a labelled data set: each item already has to be categorized.  It's not obvious what good a machine learner is, if the researcher has to do all the work in advance.  
Finally, there is the troublesome matter of errors.  What good is a system that screws up one judgement in ten?  Or one in four?&lt;br /&gt;&lt;br /&gt;In this work we considered a situation that is already becoming familiar to historians.  You have access to a large archive of sources in digital form.  These may consist of raw OCR text (full of errors), or they may be edited text, or, best of all, they may be marked up with XML, as in the case of the Old Bailey trials. Since most of us are not lucky enough to work with XML-tagged sources very often, I stripped out the tags to make my case more strongly.&lt;br /&gt;&lt;br /&gt;Now suppose you know exactly what you're looking for, but no one has gone through the sources yet to create an index that you can use.  In a traditional archive, you might be limited to starting at the beginning and plowing through the documents one at a time, skimming for whatever you're interested in.  If your archive has been digitized you have another option.  You can use a traditional search engine to index the keywords in the documents.  (You could, for example, download them all to your own computer and index them with &lt;a href="http://desktop.google.com/features.html"&gt;Google Desktop&lt;/a&gt;.  Or you could get fancy with something like &lt;a href="http://lucene.apache.org/"&gt;Lucene&lt;/a&gt;.)  Unless your topic has very characteristic keywords, however, you will be getting a mix of relevant and irrelevant results with every search.  Under many conditions, a keyword search is going to return hundreds or thousands of hits, and you are back to the point of going through them one at a time.&lt;br /&gt;&lt;br /&gt;Suppose you're interested in larceny.  (To make my point, I'm picking a category that the OB team has already marked up, but the argument is valid for anything that you or anyone else can reliably pick out.  You might be studying indirect speech, or social deference, or the history of weights and measures.  
As long as you can look at each document and say "yes, I'm interested in this" or "no, I'm not interested in this" you can use this technique.)  Anyway, you start with the first trial of 24 Nov 1834.  It is a burglary, so you throw it in the "no" pile.  The next record is a burglary, the third is a wounding, and so on.  After you skim through 1,000 trials, you've found 444 examples of larceny and 556 examples of trials that weren't larceny.  If you kept track of how long it took you to go through those thousand trials, you can estimate how long it will take for you to get through the remaining 11,959 trials in the 1830s, and approximately how many more cases of larceny you are likely to find.  But you're less than a tenth of the way through the decade's trials, and &lt;span style="font-style: italic;"&gt;no further ahead&lt;/span&gt; on the remaining ones.&lt;br /&gt;&lt;br /&gt;Machine learning gives you a very powerful alternative, as we saw in this series.  The naive Bayesian learner isn't the most accurate or precise one available, but it has a couple of enormous advantages for our application.  First of all, it is relatively easy to understand and to implement.  Although we didn't make use of this characteristic, it is also possible to stop the learner at any point and find out which features it thinks are most significant.  Second, the naive Bayesian is capable of incremental learning.  We can train it with a few labelled items, then test it on some unlabelled items, then train it some more.  Let's go back to the larceny example.  Suppose that, as you look at each of the thousand trials, you hand it off to your machine learner along with the label that you've assigned.  So once you decide the first trial is a burglary, you give it to the learner along with the label "no".  (This doesn't have to be laborious... the process could easily be built into your browser, so as you review a document, you can click a plus or minus button to label it for your learner.)  
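&lt;br /&gt;&lt;br /&gt;To make the train-as-you-go idea concrete, here is a minimal Python sketch of an incremental two-class learner.  It is not the code used in this series: the class and method names (IncrementalNaiveBayes, train, classify) are hypothetical, the example sentences are invented, and it scores raw word counts with add-one smoothing rather than the TFIDF-50 features used in the earlier posts.&lt;pre&gt;

```python
import math
import re
from collections import Counter


class IncrementalNaiveBayes:
    """Two-class naive Bayesian learner trained one labelled document at a time.

    Hypothetical illustration: raw word counts, add-one smoothing.
    """

    LABELS = ("yes", "no")

    def __init__(self):
        self.doc_counts = Counter()  # labelled documents seen so far, per label
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z]+", text.lower())

    def train(self, text, label):
        """Fold one labelled document into the running counts."""
        words = self.tokenize(text)
        self.doc_counts[label] += 1
        self.word_counts[label].update(words)
        self.vocabulary.update(words)

    def log_score(self, text, label):
        """Log prior plus summed log word likelihoods for one label."""
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        # The extra 1 keeps the denominator positive before any training.
        denominator = sum(self.word_counts[label].values()) + len(self.vocabulary) + 1
        for word in self.tokenize(text):
            score += math.log((self.word_counts[label][word] + 1) / denominator)
        return score

    def classify(self, text):
        """Return whichever label scores higher ('yes' wins ties)."""
        return max(self.LABELS, key=lambda label: self.log_score(text, label))


# Label documents as you read them, handing each one to the learner.
learner = IncrementalNaiveBayes()
learner.train("stealing a watch and a handkerchief, the goods of his master", "yes")
learner.train("breaking and entering the dwelling house at night", "no")
learner.train("stealing two shillings from the person", "yes")
learner.train("unlawfully wounding with a knife", "no")

# At any point you can stop and ask it to look ahead at unlabelled text.
print(learner.classify("stealing a handkerchief from the person"))
```

&lt;/pre&gt;Because each call to train only updates a few counters, nothing has to be retrained from scratch when the next labelled trial arrives, which is what lets the learner ride along with ordinary reading.&lt;br /&gt;&lt;br /&gt;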
Where are you after 1,000 trials?  Well, you've still found your 444 examples of larceny and your 556 examples of other offence categories.  But at this point, you've also trained a learner that can look through the next 11,959 trials in a matter of seconds and give you a pile containing about 2,500 examples of larceny and about 750 false positives.  That means that the next pile of stuff that you look through has been "enriched" for your research.  Only 44% of the first thousand trials you looked at were examples of larceny.  Almost 77% of the next three thousand trials you look at will be examples of larceny, and the remaining 23% will be more closely related offences.  Since the naive Bayesian is capable of online learning, you can continue to train it as you look through this next pile of data.&lt;br /&gt;&lt;br /&gt;Machine learning can be a powerful tool for historical research because&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It can learn as a side effect of your research process at very little cost to you&lt;/li&gt;&lt;li&gt;You can stop the system at any point to see what it has learned, getting an independent measure of a concept of interest&lt;br /&gt;&lt;/li&gt;&lt;li&gt;You can use it at any time to "look ahead" and find items that it thinks that you will be interested in&lt;/li&gt;&lt;li&gt;Its false positive errors are often instructive, giving you a way of finding interesting things just beyond the boundaries of your categories&lt;/li&gt;&lt;li&gt;A change in the learner's performance over time might signal a historically significant change or discontinuity in your sources&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | 
&lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5301736224174367288?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5301736224174367288'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5301736224174367288'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/07/naive-bayesian-in-old-bailey-part-14.html' title='A Naive Bayesian in the Old Bailey, Part 14'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3601154085869624821</id><published>2008-07-01T18:15:00.006-04:00</published><updated>2008-12-29T20:02:46.786-05:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 13</title><content type='html'>So far, we've only been working with the Old Bailey trials of the 1830s, almost thirteen thousand in total.  It would be nice to know if our learner continues to perform well as we give it more testing data.  In the following runs, I trained a TFIDF-50 learner for each offence category that was attested more than 10 times in the 1830s.  The training data consisted of all of the trials from the decade, labelled and presented to the learner in chronological order.  
Training was then stopped, and each learner was tested on the 25,403 unlabelled trials of the 1840s, also presented in chronological order.  In order to assess the learners' performance, I used the same measures that we developed earlier, comparing the ratio of misses to hits (accuracy) and the ratio of false positives to hits (precision).  As before, I added one to the denominator, so as not to accidentally divide by zero.  (Computers hate it when you do that.)&lt;br /&gt;&lt;br /&gt;The results for the accuracy measure are shown below, in the form of a bar graph rather than the scatterplot-style figure we used before.  In this graph and the next one, we can see that the performance of the learner is about as good for data that it hasn't seen (i.e., the 1840s trials) as it is for the data that were used to train it.  Most of the measures are around two or less, which is &lt;a href="http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-11.html"&gt;comparable to what we saw before&lt;/a&gt;.  The performance has actually improved for many of the offence categories, like assault, fraud, perjury, conspiracy, kidnapping, receiving and robbery.  We do notice, however, some performance degradation for a number of sexual offences, including sexual assault with sodomitical intent, bigamy, indecent assault, rape and sodomy.  This might be a statistical anomaly.  On the other hand, it might be a sign that the language that was used to describe sexual offences changed somewhat in the 1840s, causing a learner trained on 1830s data to miss later cases.  
This is one of the ways that tools like machine learning can be used to generate new research questions.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVly9m1JSDI/AAAAAAAAANY/lxtskVaUvUM/s1600-h/ob-tfidf50-1830s-40s-missratio.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 109px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVly9m1JSDI/AAAAAAAAANY/lxtskVaUvUM/s200/ob-tfidf50-1830s-40s-missratio.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285382040417028146" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The next figure shows the results for the precision measure.  In general the learner makes more false positive errors than misses, which is exactly what we want, given that &lt;a href="http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-12.html"&gt;the false positives can be useful in themselves&lt;/a&gt;.  We don't see quite the same clear difference between sexual and non-sexual offence categories that we saw with the accuracy measure ... 
and for some reason it is quite hard for our learner to pick out cases of perverted justice in the 1840s.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlzIxpBeiI/AAAAAAAAANg/qGJg8XHsVUg/s1600-h/ob-tfidf50-1830s-40s-fpratio.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 117px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlzIxpBeiI/AAAAAAAAANg/qGJg8XHsVUg/s200/ob-tfidf50-1830s-40s-fpratio.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285382232297536034" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3601154085869624821?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3601154085869624821'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3601154085869624821'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/07/naive-bayesian-in-old-bailey-part-13.html' title='A Naive Bayesian in the Old Bailey, Part 13'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVly9m1JSDI/AAAAAAAAANY/lxtskVaUvUM/s72-c/ob-tfidf50-1830s-40s-missratio.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-2276003148037320539</id><published>2008-06-27T17:11:00.009-04:00</published><updated>2008-12-29T19:57:52.850-05:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 12</title><content type='html'>Up until now, we've measured the error rates of our various learners without worrying too much about what good an error-prone machine learner actually is.  By dividing the learner's responses into the four categories of &lt;span style="font-style:italic;"&gt;hit&lt;/span&gt;, &lt;span style="font-style:italic;"&gt;miss&lt;/span&gt;, &lt;span style="font-style:italic;"&gt;false positive&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;correct negative&lt;/span&gt;, we can get a more nuanced picture of what it is doing when it makes a mistake.  Here we look at &lt;span style="font-style:italic;"&gt;false positives&lt;/span&gt;, trials that the learner mistakenly identifies as belonging to the category of interest.  We start by writing a program that goes through each of the TFIDF-50 learner's responses for the various offence categories in the 1830s. It collects all of the false positives, making a note of what offence category each trial actually belongs to.  The code to do this is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-test-false-positives.py.html"&gt;here&lt;/a&gt;.  We can then plot the information in a convenient form.  
I've decided to use pie charts.&lt;br /&gt;&lt;br /&gt;The figure below shows the results for the offence category of assault, coded as a way of breaking the peace.  What happens when our learner thinks that a trial is an example of this category but it really isn't?  About 38.6% of the time, the trial in question was actually categorized as indecent assault (sexual), and about 38.6% of the time it was assault with intent (also sexual).  Almost 11% of the time, the trial was a case of assault with sodomitical intent, and another 8% of the trials were actually categorized as an instance of wounding.  In other words, about 96% of the learner's false positive "errors" in this case were other kinds of assault.  What of the trials classified as "miscellaneous - other"?  One was &lt;a href="http://www.oldbaileyonline.org/browse.jsp?id=t18360919-2166&amp;div=t18360919-2166"&gt;this trial&lt;/a&gt;, where 44 year old William Blackburn was found guilty of "unlawfully and maliciously administering to Hannah Mary Turner 6 drachms of tincture of cantharides, with intent to excite, &amp;amp;c."  I understand that this case probably doesn't fit the definition of assault used by either Blackburn's contemporaries or by the person who coded the file.  
Nevertheless, it is not completely unrelated to the idea of an assault, and is exactly the kind of source that a historian could use to shed light on gender relations, sexuality, or other topics.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlxcviDucI/AAAAAAAAAM4/W3M6uTN4xIc/s1600-h/ob-tfidf50-1830s-fps-breakingpeace-assault.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 133px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlxcviDucI/AAAAAAAAAM4/W3M6uTN4xIc/s200/ob-tfidf50-1830s-fps-breakingpeace-assault.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285380376305580482" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The next figure shows the false positives for fraud, categorized as a kind of deception.  Seventy-two percent of the learner's false positives in this case were actually categorized as coining offences, and another 12% were actually cases of forgery.  Once again, the vast majority of cases that were incorrectly identified as fraud belonged to relatively closely related offence categories.  Note that these results cannot be explained by appealing to the distribution of offences in the sample as a whole.  If the false positives were selected by the learner at random, we would expect most of them to be cases of larceny, which are by far the most commonly attested.  
Instead we see that a learner trained to recognize one kind of assault is confused by other kinds of assault, and one trained on fraud by other kinds of fraud.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlxpUchLHI/AAAAAAAAANA/4XwaVwO56U8/s1600-h/ob-tfidf50-1830s-fps-deception-fraud.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 134px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlxpUchLHI/AAAAAAAAANA/4XwaVwO56U8/s200/ob-tfidf50-1830s-fps-deception-fraud.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285380592372886642" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A learner trained on manslaughter is mostly confused by cases of wounding and murder, as shown in the next figure.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlx0e-gnOI/AAAAAAAAANI/DeKXaNFs6mU/s1600-h/ob-tfidf50-1830s-fps-kill-manslaughter.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 142px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlx0e-gnOI/AAAAAAAAANI/DeKXaNFs6mU/s200/ob-tfidf50-1830s-fps-kill-manslaughter.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285380784178371810" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally we can consider a kind of theft, in this case housebreaking.  If any learner were going to be confused by larceny cases, it should be one trained to recognize a type of theft.  
Instead, this learner is more confused by the less-frequently attested but more closely related categories of burglary and theft from place.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlx98SrtmI/AAAAAAAAANQ/lVC1wSWjAPw/s1600-h/ob-tfidf50-1830s-fps-theft-housebreaking.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 136px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlx98SrtmI/AAAAAAAAANQ/lVC1wSWjAPw/s200/ob-tfidf50-1830s-fps-theft-housebreaking.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285380946666436194" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now we are in a position to provide one kind of answer to the question, "what good is an error-prone learner?"  Since the learner's errors are meaningfully related to its ability to categorize successfully, we can use false positives as a way of generalizing beyond the bounds of hard and fast categorization.  If we used a search engine to find cases of assault we might miss some of the most interesting such cases (like the cantharides example) ... cases that are interesting precisely because they lie just outside the category.  
One of the things that machine learning gives us is a way of finding some of the more interesting exceptions to our rules.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-2276003148037320539?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2276003148037320539'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2276003148037320539'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-12.html' title='A Naive Bayesian in the Old Bailey, Part 12'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlxcviDucI/AAAAAAAAAM4/W3M6uTN4xIc/s72-c/ob-tfidf50-1830s-fps-breakingpeace-assault.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-881713688499658309</id><published>2008-06-25T14:42:00.009-04:00</published><updated>2008-12-29T19:52:46.421-05:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 11</title><content type='html'>We feel pretty confident that the performance of the TFIDF-50 version of the naive bayesian learner is going to be relatively stable regardless of the frequency with which a particular offence is attested.  At this point we can write a routine which tests the learner on each of the offences which occurred 10 or more times in the 1830s.  Our testing routine takes advantage of the fact that, unlike many other kinds of machine learner, the naive bayesian can be operated in &lt;em&gt;online&lt;/em&gt; mode.  What this means is that we can train the learner on some data, test its performance, then train it on some more data.  Many learners can only be operated in &lt;em&gt;offline&lt;/em&gt; or &lt;em&gt;batch&lt;/em&gt; mode.  This means they have to be trained on all of the data before they can be tested, and there is no way at that point to subject them to further training.  The fact that the naive bayesian can be used for online learning will turn out to be crucial for us.&lt;br /&gt;&lt;br /&gt;The code for testing is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-online-learning.py.html"&gt;here&lt;/a&gt;. The learner is given the trials in chronological order, one at a time. 
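&lt;br /&gt;&lt;br /&gt;As a rough illustration of online testing, here is a minimal sketch of a loop that classifies each trial with the learner's current state, scores the answer as a hit, miss, false positive or correct negative, and only then trains on that trial.  It is not the linked program.  The names MajorityLearner and evaluate_online are hypothetical, and the stand-in learner simply answers with whichever label it has seen most often, so that the example runs on its own.

```python
from collections import Counter


class MajorityLearner:
    """Hypothetical stand-in: answers with the most common label seen so far."""

    def __init__(self):
        self.seen = Counter()

    def classify(self, trial_text):
        if not self.seen:
            return False  # nothing learned yet, so say "no"
        return self.seen.most_common(1)[0][0]

    def train(self, trial_text, is_target):
        self.seen[is_target] += 1


def evaluate_online(learner, trials):
    """Classify each trial, score the answer, then train on that trial.

    trials is an iterable of (text, is_target) pairs in chronological order.
    """
    tally = Counter()
    for text, is_target in trials:
        guess = learner.classify(text)
        if guess and is_target:
            tally["hit"] += 1
        elif guess:
            tally["false positive"] += 1
        elif is_target:
            tally["miss"] += 1
        else:
            tally["correct negative"] += 1
        learner.train(text, is_target)  # feedback arrives only after the guess
    return tally
```

Any learner with classify and train methods could be dropped into the same loop, which is the property that makes online evaluation possible.&lt;br /&gt;&lt;br /&gt;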
The way that the program works is that it first uses the current state of the learner to classify a trial.  The classification is scored as a hit, miss, false positive or correct negative, then the trial is used to train the learner (with the appropriate category being given as feedback).  The learner is then given the next trial to judge.  Once the learner has seen all of the data, the final count of hits, misses, etc. is output and the performance plotted as in previous posts.  The results are shown below for the 1830s.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlwq8OCQHI/AAAAAAAAAMo/yvRFYcNUBVs/s1600-h/ob-tfidf50-1830s.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 182px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlwq8OCQHI/AAAAAAAAAMo/yvRFYcNUBVs/s200/ob-tfidf50-1830s.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285379520717799538" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As can be seen, the performance is pretty stable, considering that different offences make up values ranging between 0.077% (for perverting justice 10/12959) and 42.48% of the total (for simple larceny 5505/12959).  The system gets very few false positives for bigamy, and quite a few for shoplifting.  We'll look at why this is the case in the next post.  It is very accurate for the most frequently attested offence, simple larceny, and relatively inaccurate for the infrequently attested offences of kidnapping (11/12959) and perverting justice (10/12959).  The central part of the plot is magnified and shown in the figure below.  
The performance of the learner varies for similar sorts of crime (e.g., it performs better for indecent assault than assault), something that we will take up next.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlw0JdD-CI/AAAAAAAAAMw/4yMEuR_1B2I/s1600-h/ob-tfidf50-1830s-detail.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 163px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlw0JdD-CI/AAAAAAAAAMw/4yMEuR_1B2I/s200/ob-tfidf50-1830s-detail.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285379678889310242" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-881713688499658309?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/881713688499658309'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/881713688499658309'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-11.html' title='A Naive Bayesian in the Old Bailey, Part 11'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlwq8OCQHI/AAAAAAAAAMo/yvRFYcNUBVs/s72-c/ob-tfidf50-1830s.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5922249567385697333</id><published>2008-06-22T12:37:00.005-04:00</published><updated>2008-12-29T19:50:36.456-05:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 10</title><content type='html'>In our last post, we settled on a style of plotting that shows both how accurate our learner is (i.e., does it miss very often?) and how precise (i.e., how often does it return a false positive?)  We also decided to do experiments with the version of the naive bayesian learner that uses the items with the highest tf-idf as features.  Our experiments to date have used the category of simple larceny in the 1830s.  This offence is very well-attested, making up about 42.5% of the trials (5505/12959).  At this point, we can try the performance of the same learner on offence categories that are less frequent: stealing from master (1718/12959, approx. 13.3%) and burglary (279/12959, approx 2.2%).  We've been using the 15 terms with the highest tf-idf, but we should try some other values for that parameter, too.  A graph for the three different offence categories is shown below.  
The four learners use the top scoring 15, 30, 50 and 100 items, respectively.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlwThXVMOI/AAAAAAAAAMg/PJbc_NQFivk/s1600-h/ob-tfidf-comparison.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlwThXVMOI/AAAAAAAAAMg/PJbc_NQFivk/s200/ob-tfidf-comparison.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285379118372040930" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;From the graph, it is pretty clear that it is easiest to learn to categorize larceny, which is the best-attested offence we looked at.  We can also see that the TFIDF-15 learner does particularly poorly by missing many instances of the less frequent offences.  Increasing the number of features the learner can make use of seems to improve performance up to a point.  After that, increasing features increases the number of false positives the learner makes.  We want the performance of our learner to be relatively robust when learning offence categories that are more or less frequently attested, which means we want the learner with the tightest grouping of results for these test categories (in other words, TFIDF-50).&lt;br /&gt;&lt;br /&gt;Note that in this test, we only ran each learner once on each data set, rather than doing ten-fold cross-validation.  Our &lt;a href="http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-7.html"&gt;experiments with cross-validation&lt;/a&gt; suggested that the different versions of the learner were relatively insensitive to the order in which training and testing trials were presented.  Since this is exploratory work, we will make the (possibly incorrect) assumption that a single trial is probably representative.  
This will let us do a lot more testing in the same amount of time.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5922249567385697333?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5922249567385697333'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5922249567385697333'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-10.html' title='A Naive Bayesian in the Old Bailey, Part 10'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlwThXVMOI/AAAAAAAAAMg/PJbc_NQFivk/s72-c/ob-tfidf-comparison.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3912810453593271867</id><published>2008-06-21T16:56:00.006-04:00</published><updated>2008-12-29T19:49:23.980-05:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 9</title><content type='html'>There are many different ways to measure the performance of our various learning algorithms.  So far we have defined the error rate as the sum of misses and false positives divided by the total number of trials.  By this measure, &lt;span style="font-style:italic;"&gt;COINFLIP&lt;/span&gt; had an average error rate around 50%, and our naive bayesian learner had an error rate around 40% using one-word features, and around 26% using either 2-grams or top-scoring tf-idf features.  I thought I might be able to get better performance by using only those 2-grams that included terms with a high tf-idf, but that learner had an error rate around 26%, too.  (Recall that we've been using cases of simple larceny in the 1830s for our experiments... the performance will be different for other offences and/or other decades.  We'll test some of these soon.)&lt;br /&gt;&lt;br /&gt;By using a different measure, we can see that our various learners achieve their results in different ways.  From our perspective as researchers, the least interesting category of answers is the correct negatives.  Misses are a problem, because they may contain evidence that relates to the argument that we're trying to construct.  
False positives are a problem, because they are irrelevant but we have to look through them to determine that... in other words, they're a waste of time.  A perfect learner would return all and only hits.  If we consider the ratio of misses to hits we can get an idea of how accurate our learner is.  As a learner gets better, the ratio of misses to hits approaches 0.  As it gets worse, the ratio increases.  A disastrous learner might not get any hits, so to avoid a division by zero error, we'll add one to the denominator.  Our accuracy measure is thus &lt;span style="font-style:italic;"&gt;misses / (hits + 1)&lt;/span&gt;.  If we consider the ratio of false positives to hits we can find out how precise our learner is.  As it gets better, this ratio will go to zero, and as it gets worse, the ratio will increase.  Our precision measure is &lt;span style="font-style:italic;"&gt;false positives / (hits + 1)&lt;/span&gt;.  We can plot both measures on the same graph, with the origin in the lower left hand corner, as shown below.  Since some of the values are large, I've used logarithmic axes.  (Also, the results for &lt;span style="font-style:italic;"&gt;YES&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;NO&lt;/span&gt; actually lie on the respective zero lines, but I've bumped them over so they can be seen in this plot.)&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlwAbAKiHI/AAAAAAAAAMY/AAHwKQYAKi4/s1600-h/ob-larceny-stats-2.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 184px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlwAbAKiHI/AAAAAAAAAMY/AAHwKQYAKi4/s200/ob-larceny-stats-2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285378790246746226" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Looking at the graph we notice some interesting results.  
The naive bayesian that uses words for features gets relatively few false positives, but at the cost of missing an order of magnitude more items than the other two learners.  The 2-gram learner outperforms &lt;span style="font-style:italic;"&gt;COINFLIP&lt;/span&gt; and the tf-idf learner on false positives, but not on misses.  The tf-idf learner is the only one that outperforms &lt;span style="font-style:italic;"&gt;COINFLIP&lt;/span&gt; in terms of both accuracy and precision.  Thus we will do our next round of experiments with the tf-idf learner. &lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3912810453593271867?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3912810453593271867'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3912810453593271867'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-9.html' title='A Naive Bayesian in the Old Bailey, Part 9'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlwAbAKiHI/AAAAAAAAAMY/AAHwKQYAKi4/s72-c/ob-larceny-stats-2.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3180635249931041919</id><published>2008-06-18T20:03:00.007-04:00</published><updated>2008-06-18T20:45:31.667-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 8</title><content type='html'>In the last post, we got a naive bayesian learner working and used it to categorize some Old Bailey trials from the 1830s as examples of larceny (or not).  Our initial version of the learner was easy to implement, but it made the unrealistic assumption that the probabilities of particular words appearing in the text of a trial were independent.  That greatly simplified computation at the cost of performance.  Our initial learner had an error rate around 40%.  We then revised it to use 2-grams as features rather than individual words.  This captured some of the dependency between words, improving our average error rate so it was close to 25%.&lt;br /&gt;&lt;br /&gt;An alternative approach is to try and concentrate on the words in a trial which are most representative of a particular category.  Without specifying these words in advance, we can make the assumption that they will be relatively frequent in the document in question, but relatively infrequent in the overall corpus of documents.  One common measure for this is known as &lt;a href="http://en.wikipedia.org/wiki/Tfidf"&gt;tf-idf&lt;/a&gt;.  
Rather than handing all of the words in a given trial to our learner, or all except the stop words, we will only hand off the 15 or 20 with the highest tf-idf.  There are many different ways to compute this measure.  The version that I used is &lt;span style="font-style: italic;"&gt;tfidf = log(tf+1.0) * log(numdocs/df)&lt;/span&gt;, where &lt;span style="font-style: italic;"&gt;tf&lt;/span&gt; is the number of times the word occurs in a particular text, &lt;span style="font-style: italic;"&gt;numdocs&lt;/span&gt; is the total number of documents, and &lt;span style="font-style: italic;"&gt;df&lt;/span&gt; is the number of documents that the word appears in.  The word "cellar," for example, appears in &lt;a href="http://www.oldbaileyonline.org/browse.jsp?div=t18341124-1"&gt;this trial&lt;/a&gt; seventeen times, and in 221 other trials in the 1830s.  Using natural logarithms, the &lt;span style="font-style: italic;"&gt;tfidf&lt;/span&gt; for this word in this trial is &lt;span style="font-style:italic;"&gt;log(17+1) * log(12959/221) = 11.76781&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;To compute the tf-idf, we first need to create a list of every word that was used in all of the trials, and the number of different trials in which each word appears.  We could put this information in a text file, but the file would be huge and very slow to access.  Instead, we will store our document frequencies in a &lt;a href="http://www.sqlite.org/"&gt;SQLite&lt;/a&gt; database, using Python commands to store and retrieve the information.  The code which creates this database is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-compute-doc-freqs.py.html"&gt;here&lt;/a&gt;.  We can then compute the tf-idf scores for each word in a given trial, creating a new directory to store these files.  The code to do that is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-compute-tfidf.py.html"&gt;here&lt;/a&gt;.  
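&lt;br /&gt;&lt;br /&gt;To make the formula concrete, here is a minimal sketch of the scoring step.  It is not the linked code.  The function names and the doc_freqs dictionary are assumptions made for illustration, with a plain dictionary standing in for the SQLite table of document frequencies.

```python
import math
from collections import Counter


def tfidf(tf, numdocs, df):
    """Score one word with the formula from the post (natural logarithms)."""
    return math.log(tf + 1.0) * math.log(numdocs / df)


def top_terms(words, doc_freqs, numdocs, n=15):
    """Return the n words of one document with the highest tf-idf.

    doc_freqs is a hypothetical dict mapping each word to the number of
    documents it appears in; the post keeps these counts in SQLite instead.
    """
    counts = Counter(words)
    scores = {w: tfidf(tf, numdocs, doc_freqs[w]) for w, tf in counts.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# The "cellar" example from the text: tf = 17, df = 221, numdocs = 12959.
cellar_score = tfidf(17, 12959, 221)
print(round(cellar_score, 3))
```

A word that appears in every document gets a score of zero, because log(numdocs/df) is then log(1), so very common words drop out of the top-scoring list without needing a separate stop list.&lt;br /&gt;&lt;br /&gt;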
Finally, we will want a version of our tenfold cross-validation routine to test the performance of a naive bayesian learner that operates across tf-idf vectors rather than raw words or 2-grams (&lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-cross-validate-tfidf-learner.py.html"&gt;here&lt;/a&gt;).  This new learner has similar performance to the 2-gram version, with an average error rate of 25.73% when using the 15 highest scoring tf-idf terms to categorize cases of larceny in the 1830s.  As a bonus, it is remarkably fast.  At this point, you're probably wondering what good a machine learner is, if one quarter of its judgments are incorrect.  We'll get there.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3180635249931041919?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3180635249931041919'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3180635249931041919'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-8.html' title='A Naive Bayesian in the Old Bailey, Part 8'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4271132176142741411</id><published>2008-06-17T10:46:00.012-04:00</published><updated>2008-12-29T19:46:26.816-05:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 7</title><content type='html'>At last we're in a position to actually train and test some machine learners.  The one that we'll start with is called a &lt;span style="font-style: italic;"&gt;naive bayesian&lt;/span&gt;.  It is relatively simple to implement, although it usually doesn't perform nearly as well as fancier and more complicated learners.  For our purposes, however, it has some real advantages, which we'll get to spelling out eventually.  The version of the naive bayesian learner that I am going to use is the one that was implemented by Toby Segaran in his book &lt;span style="font-style: italic;"&gt;&lt;a href="http://www.amazon.com/Programming-Collective-Intelligence-Building-Applications/dp/0596529325/"&gt;Programming Collective Intelligence&lt;/a&gt;&lt;/span&gt;.  I won't post the code for the learner here, as it is already &lt;a href="http://blog.kiwitobes.com/?p=44"&gt;available online&lt;/a&gt;.  If you are able to follow this series of posts and are interested in writing machine learning code in Python, Toby's book is a must-have.  The only change that I have implemented is to remove stop words before submitting the trials for training or testing.  
You can get instructions and code for that from &lt;span style="font-style: italic;"&gt;&lt;a href="http://niche.uwo.ca/programming-historian/index.php/Main_Page"&gt;The Programming Historian&lt;/a&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Bayesian learners make use of a theorem proposed by Thomas Bayes and published in 1763, two years after his death (for more on Bayes, see Bellhouse's &lt;a href="http://www.york.ac.uk/depts/maths/histstat/bayesbiog.pdf"&gt;biography&lt;/a&gt;).  The theorem states that &lt;span style="font-style:italic;"&gt;Pr[H|E] = (Pr[E|H] * Pr[H]) / Pr[E]&lt;/span&gt;.  &lt;span style="font-style:italic;"&gt;Pr[H|E]&lt;/span&gt; is the probability that the hypothesis &lt;span style="font-style:italic;"&gt;H&lt;/span&gt; is true, given some evidence &lt;span style="font-style:italic;"&gt;E&lt;/span&gt;.  &lt;span style="font-style:italic;"&gt;Pr[E|H]&lt;/span&gt; is the probability that you would see evidence &lt;span style="font-style:italic;"&gt;E&lt;/span&gt; if the hypothesis &lt;span style="font-style:italic;"&gt;H&lt;/span&gt; were true.  &lt;span style="font-style:italic;"&gt;Pr[H]&lt;/span&gt; is the probability of the hypothesis and &lt;span style="font-style:italic;"&gt;Pr[E]&lt;/span&gt; the probability of the evidence.  Bayes' theorem gives us a way of determining conditional probabilities: if we know one thing, how likely is something else to be true?&lt;br /&gt;&lt;br /&gt;Let's work through a simple example.  Suppose bag A contains one black marble and three white ones, and bag B contains two white marbles and two black ones.  Someone gives us a black marble but doesn't remember which bag they took it from.  Given that we have a black marble, what are the chances that it came from bag A?  In this case, &lt;span style="font-style:italic;"&gt;Pr[H]&lt;/span&gt; is the probability the marble came from bag A.  Since each bag contains the same number of marbles, &lt;span style="font-style:italic;"&gt;Pr[H] = 4/8 = 1/2&lt;/span&gt;.  
&lt;span style="font-style:italic;"&gt;Pr[E]&lt;/span&gt; is the probability that a marble is black, so  &lt;span style="font-style:italic;"&gt;Pr[E] = (1+2)/8 = 3/8&lt;/span&gt;. &lt;span style="font-style:italic;"&gt;Pr[E|H]&lt;/span&gt; is the probability that you are going to get a black marble if you choose from bag A, in other words &lt;span style="font-style:italic;"&gt;Pr[E|H] = 1/4&lt;/span&gt;.  So Bayes' theorem says that &lt;span style="font-style:italic;"&gt;Pr[H|E] = (1/4 * 1/2) / (3/8) = 1/3&lt;/span&gt;.  Since we know that the marble had to come from one of the two bags, that means that it should have a 2/3 chance of coming from bag B, which we can double check.  &lt;span style="font-style:italic;"&gt;Pr[notH|E] = (Pr[E|notH] * Pr[notH]) / Pr[E] = (2/4 * 1/2) / (3/8) = 2/3&lt;/span&gt;, as expected.  You can learn more about Bayes' theorem &lt;a href="http://plato.stanford.edu/entries/bayes-theorem/"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;When applied to the problem of learning, Bayes' theorem looks like this: &lt;span style="font-style:italic;"&gt;Pr[category|document] = Pr[document|category] * Pr[category]&lt;/span&gt;.  (We don't need to divide by &lt;span style="font-style:italic;"&gt;Pr[document]&lt;/span&gt; in this equation because it scales the result for every category by the same amount, so it doesn't change which category scores highest.)  We make the (incorrect) assumption that the probability of each word in the document is independent of the others, so we can set &lt;span style="font-style:italic;"&gt;Pr[document|category]&lt;/span&gt; equal to &lt;span style="font-style:italic;"&gt;Pr[word1|category] * Pr[word2|category] * ...&lt;/span&gt; Finally, &lt;span style="font-style:italic;"&gt;Pr[category]&lt;/span&gt; is simply the proportion of all documents that belong to our category of interest.&lt;br /&gt;&lt;br /&gt;So how well does the naive bayesian learner do?  Not very well.  
In a tenfold cross-validation run testing for cases of simple larceny in the 1830s it has an average error rate of 39.17%, compared with &lt;span style="font-style:italic;"&gt;COINFLIP&lt;/span&gt;'s average error rate of 49.39%.  The error rate is simply &lt;span style="font-style:italic;"&gt;(Misses + False Positives) / Total Number of Trials&lt;/span&gt;.  Part of the problem is that we made the assumption that the probability of any word in a document is independent of the probability of any other word in the same document.  We know this isn't strictly true.  In the Old Bailey proceedings, for example, you find both "dwelling" and "dwelling house", as well as "victualling house", "sessions house", "station house", "house keeper" and many other forms.  To the extent that these and other words tend to co-occur, the word probabilities can't be independent.  We can improve the performance of our naive bayesian learner by using pairs of words (i.e., 2-grams) rather than individual words as features for the learner.  This drops the error rate to 26.23% when categorizing trials for simple larceny in the 1830s.  The code that tests the different learners is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-cross-validate-learner.py.html"&gt;here&lt;/a&gt;.  
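&lt;br /&gt;&lt;br /&gt;The ideas in this post can be sketched in a few lines of Python.  This is a toy written for illustration, not Toby Segaran's implementation or the linked testing code.  The class name NaiveBayes, the add-one smoothing and the choice to sum logarithms rather than multiply probabilities are all assumptions of the sketch.

```python
import math
from collections import Counter, defaultdict


def two_grams(words):
    """Turn a list of words into adjacent pairs, e.g. 'dwelling house'."""
    return [" ".join(pair) for pair in zip(words, words[1:])]


class NaiveBayes:
    """Toy naive bayesian; add-one smoothing is an assumption of this sketch."""

    def __init__(self):
        self.doc_counts = Counter()                 # documents per category
        self.feature_counts = defaultdict(Counter)  # feature counts per category
        self.vocabulary = set()

    def train(self, features, category):
        self.doc_counts[category] += 1
        self.feature_counts[category].update(features)
        self.vocabulary.update(features)

    def score(self, features, category):
        """log(Pr[category]) plus the sum of log(Pr[feature|category])."""
        total_docs = sum(self.doc_counts.values())
        counts = self.feature_counts[category]
        denom = sum(counts.values()) + len(self.vocabulary)
        logp = math.log(self.doc_counts[category] / total_docs)
        for f in features:
            logp += math.log((counts[f] + 1) / denom)
        return logp

    def classify(self, features):
        """Return the trained category with the highest score."""
        return max(self.doc_counts, key=lambda c: self.score(features, c))


def error_rate(misses, false_positives, total):
    """(Misses + False Positives) / Total Number of Trials."""
    return (misses + false_positives) / total
```

Passing a plain list of words to train and classify gives the one-word version of the learner, and passing two_grams(words) gives the 2-gram version, so the same class covers both experiments.  Summing logarithms avoids the very small numbers that come from multiplying many probabilities together.&lt;br /&gt;&lt;br /&gt;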
A graph of performance is shown below.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlvVFpgkSI/AAAAAAAAAMQ/PlMaWdOQjQU/s1600-h/ob-cross-validation-larceny-stats.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 126px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlvVFpgkSI/AAAAAAAAAMQ/PlMaWdOQjQU/s200/ob-cross-validation-larceny-stats.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285378045780201762" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4271132176142741411?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4271132176142741411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4271132176142741411'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-7.html' title='A Naive Bayesian in the Old Bailey, Part 7'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlvVFpgkSI/AAAAAAAAAMQ/PlMaWdOQjQU/s72-c/ob-cross-validation-larceny-stats.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8843049022315858124</id><published>2008-06-13T06:32:00.004-04:00</published><updated>2008-06-13T09:07:59.182-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 6</title><content type='html'>Now that we have our training and testing samples, we will be able to estimate the error rates of our various machine learners.  Some of them won't be very good, especially if they are trained on relatively small or unrepresentative samples.  None of them will be perfect, or even approach human performance.  So it is usually a good idea to ask if the performance of a given learner is significantly different from chance.  Consider three other abstract machines which don't do any learning at all.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; is a very simple machine.  When given an item and asked whether or not it is an instance of a particular category, &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; says "yes".  That's it.  Suppose we have 100 test items and all of them are instances of our category, say 100 examples of burglary.  We ask &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; about each of them and it 'decides' that each is a burglary.  &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; makes no errors at all on this test sample!  
If half of the test items are not burglaries, however, &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt;'s error rate climbs to 50%.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;NO&lt;/span&gt; is also a very simple machine, responding "no" whenever tested.  If we give it 100 examples of burglaries, it will fail to recognize every single one of them, with an error rate of 100%.  The fewer burglaries our test sample contains, the better &lt;span style="font-style: italic;"&gt;NO&lt;/span&gt; does.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; is more sophisticated than &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; or &lt;span style="font-style: italic;"&gt;NO&lt;/span&gt;.  Every time we ask &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; to make a decision, it has a 50% chance of responding "yes" and a 50% chance of responding "no".  Given a sample with 100 examples of burglaries, &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; gets it wrong about half the time.  Given a sample with no burglaries in it, &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; will also have an error rate around 50%.&lt;br /&gt;&lt;br /&gt;With these three simple machines, we can be more clear about what it means to be right or wrong, distinguishing four categories:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Hit&lt;/span&gt;.  If the machine says "yes" and the right answer is "yes", we say that it has scored a hit.  This is one kind of correct answer.  Both &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; are capable of scoring hits, but &lt;span style="font-style: italic;"&gt;NO&lt;/span&gt; never is, because it can never say "yes" to anything.&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;False Positive&lt;/span&gt;.  
If the machine says "yes" but the answer is really "no", we say that it has responded with a false positive, which is one kind of incorrect answer.  &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; can reply with false positives, but &lt;span style="font-style: italic;"&gt;NO&lt;/span&gt; cannot.&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Miss&lt;/span&gt;.  If the machine says "no" but the correct answer is "yes", we say that it missed.  &lt;span style="font-style: italic;"&gt;NO&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; can miss, but &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; cannot, because it never says "no".&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;Correct Negative&lt;/span&gt;.  This happens when the machine says "no" and the correct answer is "no".  &lt;span style="font-style: italic;"&gt;NO&lt;/span&gt; and &lt;span style="font-style: italic;"&gt;COINFLIP&lt;/span&gt; can reply with correct negatives, but &lt;span style="font-style: italic;"&gt;YES&lt;/span&gt; cannot.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;We expect our learners to produce answers in each of the four categories.  A machine that always hits will also tend to identify many false positives.  This can be good if you are looking for a needle in a haystack, but will overwhelm you if your category is well attested.  A machine that always identifies correct negatives will often miss things.  These kinds of machines tend to be more useful when you would never have time to go through all of your items by hand.  
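&lt;br /&gt;&lt;br /&gt;The three baseline machines and the four outcome categories can be sketched in a few lines of Python.  This is only an illustration under stated assumptions: the function names are hypothetical, each test item is paired with a true or false answer supplied by a human, and the error rate is taken to be misses plus false positives divided by the total number of items tested.

```python
import random
from collections import Counter


def yes_machine(item):
    """YES: says "yes" to everything."""
    return True


def no_machine(item):
    """NO: says "no" to everything."""
    return False


def make_coinflip(seed=None):
    """COINFLIP: says "yes" or "no" with equal probability."""
    rng = random.Random(seed)

    def coinflip(item):
        return rng.choice([True, False])

    return coinflip


def tally(machine, items, truths):
    """Count hits, false positives, misses and correct negatives."""
    counts = Counter()
    for item, truth in zip(items, truths):
        said_yes = machine(item)
        if said_yes and truth:
            counts["hit"] += 1
        elif said_yes:
            counts["false positive"] += 1
        elif truth:
            counts["miss"] += 1
        else:
            counts["correct negative"] += 1
    return counts


def error_rate(counts):
    """(misses + false positives) / total; assumes at least one item."""
    total = sum(counts.values())
    return (counts["miss"] + counts["false positive"]) / total


if __name__ == "__main__":
    # 50 burglaries followed by 50 trials that are not burglaries.
    truths = [True] * 50 + [False] * 50
    items = list(range(len(truths)))
    machines = [("YES", yes_machine), ("NO", no_machine),
                ("COINFLIP", make_coinflip(seed=1))]
    for name, machine in machines:
        print(name, error_rate(tally(machine, items, truths)))
```

Changing the mix of burglaries in the test sample shifts the error rates of YES and NO in opposite directions, while COINFLIP stays near one half.&lt;br /&gt;&lt;br /&gt;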
Most machine learners have parameters that allow you to tune their performance between these extremes.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8843049022315858124?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8843049022315858124'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8843049022315858124'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-6.html' title='A Naive Bayesian in the Old Bailey, Part 6'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7550786387838578710</id><published>2008-06-12T10:12:00.007-04:00</published><updated>2008-06-12T10:46:59.548-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 5</title><content type='html'>With most of our support routines in place, we need to think about the problem of training a machine learner and then assessing its performance.  A human being has already gone through each of the trials and assigned one or more offence categories to it:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://www.oldbaileyonline.org/browse.jsp?div=t18341124-1"&gt;this trial&lt;/a&gt; is a burglary, which is a kind of theft&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.oldbaileyonline.org/browse.jsp?div=t18341124-2"&gt;this trial&lt;/a&gt; is also a burglary&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.oldbaileyonline.org/browse.jsp?div=t18341124-3"&gt;this trial&lt;/a&gt; is a wounding, which is a way of breaking the peace&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;/ul&gt;So we can give each raw trial to our learner and ask it to decide what offence category the trial belongs to, then we can check our learner's answer against the human-assigned category.  If we do enough of these trials, we can get a precise sense of how good our learner is.&lt;br /&gt;&lt;br /&gt;Most machine learning researchers use a &lt;em&gt;holdout method&lt;/em&gt; to test the performance of their learning algorithms. They use part of the data to train the system, then test its performance on the remaining part, the part that wasn't used for training.  
Items are randomly assigned to either the training or the testing pile, with the further stipulation that both piles should have the same distribution of examples.  Since burglaries made up about 2.153% (279/12959) of the trials in the 1830s, we want burglaries to make up about two percent of the training data and about two percent of the test data.  It would do us no good for all of the burglaries to end up in one pile or the other.&lt;br /&gt;&lt;br /&gt;But how do we know whether the results that we're seeing are some kind of fluke?  We use &lt;span style="font-style: italic;"&gt;cross-validation&lt;/span&gt;.  We randomly divide our data into a number of piles (usually 10), making sure that the category that we are interested in is uniformly distributed across those piles.  Now, we set aside the first pile and use the other nine piles to train our learner.  We then test it on the first pile and record its performance.  We then set aside the second pile for testing, and use the other nine piles for training a new learner.  And so on, until each item has been used both for testing and for training.  We can then average the ten error estimates.  There are many other methods in the literature, of course, but this one is fairly standard.&lt;br /&gt;&lt;br /&gt;Code to create a tenfold cross-validation sample from our data is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-tenfold-crossvalidation-sample.py.html"&gt;here&lt;/a&gt;.  
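&lt;br /&gt;&lt;br /&gt;The procedure just described can be sketched as follows.  This is not the linked program, only a minimal illustration: the function names, the use of a fixed random seed, and the round-robin way of dealing items into piles are all assumptions made for the example.

```python
import random


def stratified_folds(trial_ids, positives, k=10, seed=0):
    """Deal trial IDs into k piles, spreading the category evenly.

    `positives` holds the IDs that belong to the category of interest
    (e.g. burglaries). Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    wanted = set(positives)
    in_category = [t for t in trial_ids if t in wanted]
    others = [t for t in trial_ids if t not in wanted]
    rng.shuffle(in_category)
    rng.shuffle(others)
    folds = [[] for _ in range(k)]
    # Dealing the category members first, one per pile in turn, keeps
    # their counts within one of each other across the piles.
    for i, trial_id in enumerate(in_category + others):
        folds[i % k].append(trial_id)
    return folds


def cross_validation_splits(folds):
    """Yield (training, testing) pairs, holding out each pile once."""
    for i, testing in enumerate(folds):
        training = [t for j, fold in enumerate(folds) if j != i for t in fold]
        yield training, testing


if __name__ == "__main__":
    ids = ["trial-%d" % n for n in range(100)]
    folds = stratified_folds(ids, positives=ids[:20])
    for training, testing in cross_validation_splits(folds):
        print(len(training), len(testing))
```

A learner would be trained on each training list and scored on the matching testing list, and the ten error estimates averaged.&lt;br /&gt;&lt;br /&gt;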
As a check, we'd also like to make sure that our offence category is reasonably distributed across our sample (code for that is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-count-offence-instances.py.html"&gt;here&lt;/a&gt;).&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7550786387838578710?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7550786387838578710'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7550786387838578710'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-5.html' title='A Naive Bayesian in the Old Bailey, Part 5'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8411379322644074055</id><published>2008-06-09T12:37:00.007-04:00</published><updated>2008-06-09T13:10:10.976-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 4</title><content type='html'>With raw text files for each of the trials, we're almost in a position to try doing some experiments with a machine learner.  Before we get started we are going to need a few utility routines to make our lives easier.  Programmers enjoy writing tools so much they have a special expression for the process: &lt;a href="http://www.catb.org/jargon/html/Y/yak-shaving.html"&gt;yak shaving&lt;/a&gt;.  Sometimes it's necessary, sometimes it's just fun, sometimes it's a great way to procrastinate.  We'll try to keep it in check.&lt;br /&gt;&lt;ul&gt;&lt;li&gt;First of all, we'll want lists of all of the files that need to be processed in a given decade. We could use the operating system for this, but Windows is pretty slow when you have tens of thousands of files in a directory.  A program to grab the list of filenames to another text file is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-trial-id-list.py.html"&gt;here&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;We're also going to want a list of all of the dates on which trials occurred (in other words, we will want a list of all of the days that the court was in session).  
The program to generate that list and sort it in ascending order is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-get-date-range.py.html"&gt;here&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Since our initial experiments will be focused on trying to automatically categorize trials by offence (e.g., "burglary"), we are going to need a few routines that make it easier to work with offences.  One of these needs to return a mapping from trial IDs to one or more categories of offence (the code is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-offence-category.py.html"&gt;here&lt;/a&gt;):&lt;br /&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;t-18341124-1.txt -&gt; theft-burglary&lt;/li&gt;&lt;li&gt;t-18341124-2.txt -&gt; theft-burglary&lt;/li&gt;&lt;li&gt;t-18341124-3.txt -&gt; breakingpeace-wounding&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;li&gt;t-18341124-37.txt -&gt; theft-stealingfrommaster, theft-simplelarceny&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Another routine needs to return a mapping from a particular offence to a list of matching trial IDs (the code is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-offence-index.py.html"&gt;here&lt;/a&gt;):&lt;/li&gt;&lt;ul&gt;&lt;li&gt;theft-burglary -&gt; t-18341124-1.txt, t-18341124-183.txt, t-18341124-185.txt, t-18341124-2.txt, t-18341124-4.txt, ...&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Finally, we are going to need to have some idea of how many offences there were of each kind in a particular decade (the code is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-count-offences.py.html"&gt;here&lt;/a&gt;).  
For the 1830s, the data look like the following:&lt;/li&gt;&lt;ul&gt;&lt;li&gt;breakingpeace-assault.txt|51&lt;/li&gt;&lt;li&gt;breakingpeace-libel.txt|7&lt;/li&gt;&lt;li&gt;breakingpeace-riot.txt|5&lt;/li&gt;&lt;li&gt;breakingpeace-threateningbehaviour.txt|4&lt;/li&gt;&lt;li&gt;breakingpeace-wounding.txt|166&lt;/li&gt;&lt;li&gt;breakingpeace.txt|1&lt;/li&gt;&lt;li&gt;damage-arson.txt|7&lt;/li&gt;&lt;li&gt;damage-other.txt|1&lt;/li&gt;&lt;li&gt;...&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;That about does it for the utility routines.  Next we have to address the problem of sampling.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8411379322644074055?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8411379322644074055'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8411379322644074055'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-4.html' title='A Naive Bayesian in the Old Bailey, Part 4'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1382692167822775089</id><published>2008-06-07T11:08:00.004-04:00</published><updated>2008-06-07T11:36:11.788-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 3</title><content type='html'>At this point, we have a directory that contains one file for each trial in a given decade.  There are a lot of these files (almost 13,000 for the 1830s alone) and each trial is still marked up with XML.  In the next step we're going to create parallel directories that contain trial files with all of the XML stripped out.  In other words, our trial files currently look like this:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt;trial id="t-18341124-1" n="1"&amp;gt;&amp;lt;charge n="1" defid="def1-1-18341124" offenceno="1" verdictno="1"&amp;gt;&amp;lt;/charge&amp;gt;&amp;lt;charge n="2" defid="def2-1-18341124" offenceno="1" verdictno="2"&amp;gt;&amp;lt;p&amp;gt;&lt;/span&gt;1. 
&lt;span style="font-style: italic;"&gt;&amp;lt;name role="defendant" id="def1-1-18341124" age="30" sex="m" given="JOHN" occupation="na" surname="HOLGATE"&amp;gt;&amp;lt;lc&amp;gt;&lt;/span&gt;JOHN HOLGATE&lt;span style="font-style: italic;"&gt;&amp;lt;/lc&amp;gt;&amp;lt;/name&amp;gt;&lt;/span&gt; and&lt;br /&gt;&lt;span style="font-style: italic;"&gt;&amp;lt;name role="defendant" id="def2-1-18341124" age="27" sex="m" given="JAMES" occupation="na" surname="HOLGATE"&amp;gt;&amp;lt;lc&amp;gt;&lt;/span&gt;JAMES HOLGATE&lt;span style="font-style: italic;"&gt;&amp;lt;/lc&amp;gt;&amp;lt;/name&amp;gt;&amp;lt;offence n="1" ids="def1-1-18341124 def2-1-18341124"&amp;gt;&amp;lt;theft category="burglary"&amp;gt;&lt;/span&gt; were indicted, for that they, on the &lt;span style="font-style: italic;"&gt;&amp;lt;cd&amp;gt;&lt;/span&gt;1st of October&lt;span style="font-style: italic;"&gt;&amp;lt;/cd&amp;gt;&lt;/span&gt;, at &lt;span style="font-style: italic;"&gt;&amp;lt;geo&amp;gt;&lt;/span&gt;St. Mary Magdalen, Bermondsey&lt;span style="font-style: italic;"&gt;&amp;lt;/geo&amp;gt;&lt;/span&gt;, about four o'clock in the night, the dwelling-house of &lt;span style="font-style: italic;"&gt;&amp;lt;name age="na" given="JOHN" residency="st mary magdalen bermondsey" role="victim" sex="m" surname="THOMPSON"&amp;gt;&lt;/span&gt;John Thompson&lt;span style="font-style: italic;"&gt;&amp;lt;/name&amp;gt;&lt;/span&gt;, feloniously and burglariously did break and enter...&lt;/blockquote&gt;&lt;br /&gt;and we're going to make copies that look like this:&lt;br /&gt;&lt;blockquote&gt;1 john holgate and james holgate were indicted for that they on the 1st of october at st mary magdalen bermondsey about four o clock in the night the dwelling house of john thompson feloniously and burglariously did break and enter...&lt;/blockquote&gt;&lt;br /&gt;It may seem a bit perverse to take out information that the OB team worked very hard to create, so it is probably a good idea to step back and get a 
broader overview of the data mining process.  When writing programs to manipulate digital sources you can head down one of two paths.  You can choose to explicitly encode more and more semantic (i.e., meaningful) information.  This is what the OB team has done with XML markup.  By using &lt;span style="font-style:italic;"&gt;&amp;lt;geo&amp;gt;...&amp;lt;/geo&amp;gt;&lt;/span&gt; tags to indicate that "St. Mary Magdalen, Bermondsey" is a geographical location, they are able to provide a powerful search engine that can &lt;a href="http://www.hrionline.ac.uk/ccc/forms/formMaps.jsp"&gt;find places&lt;/a&gt;.  Similarly, by indicating the age and sex of criminals and victims they make it possible for researchers to do a variety of sophisticated &lt;a href="http://www.hrionline.ac.uk/ccc/forms/formStats.jsp"&gt;statistics&lt;/a&gt; on the archive as a whole.  The downside, of course, is that this kind of explicit tagging is very labor-intensive.  It is wonderful to be able to work with a digital archive that someone else has edited and marked up, but often you face a corpus of documents that is little better than raw text, or worse, that contains a high percentage of &lt;a href="http://en.wikipedia.org/wiki/Optical_character_recognition"&gt;OCR&lt;/a&gt; errors.&lt;br /&gt;&lt;br /&gt;An alternative approach is to work with domain-neutral representations and algorithms.  You write programs that can't tell the difference between a person's name and a place name, between English and French, or between natural language and a genomic sequence.  This is closer to what traditional search engines like Google do.  The downside is that you can search for text that includes the string "Bermondsey" but you can't tell Google to limit your search to geographic uses of the term.  
Instead your results include the neighborhood, the tube station, a local history group, a diving club, a biography, a hymn, some photos, and so on.&lt;br /&gt;&lt;br /&gt;Having access to text that has been semantically marked up makes it possible to create and test a wide range of powerful tools that can then be used on raw text that hasn't been marked up.  For example, we know that &lt;a href="http://www.oldbaileyonline.org/browse.jsp?div=t18341124-1"&gt;this particular trial&lt;/a&gt; was for a burglary that ended with the execution of two men.  Suppose we want to create a computer program which can classify trials as either "burglary" or "not a burglary."  We start by creating an archive of raw text examples by stripping out the markup.  We give these texts to our program, one by one, and tell it whether each text was an instance of "burglary" or not.  With any luck, the program learns, somehow, to distinguish burglaries from other kinds of trial.  (The details will be filled in as we go along.)  Now, we can test the program on other examples from this archive and get a precise sense of how well it does.  If it seems to work, we can then use it to try to ferret out burglaries from other collections of untagged trial records, or even from a mass of undifferentiated search results.&lt;br /&gt;&lt;br /&gt;So, to create a clean copy of each of the trial files we're going to use a brute-force method.  We will simply copy the file, one character at a time, skipping any characters that fall between &lt;span style="font-style:italic;"&gt;&amp;lt;&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;&amp;gt;&lt;/span&gt; inclusive.  
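&lt;br /&gt;&lt;br /&gt;As a rough sketch of that character-by-character copy (not the linked program itself), the idea looks like this.  The function names are hypothetical.  The second function, which lowercases the text and turns punctuation into spaces, is an assumption inferred from the sample output shown earlier rather than something the description spells out.

```python
# The two angle-bracket characters, written as escape codes so that this
# listing contains no literal angle brackets.
OPEN_TAG = "\x3c"
CLOSE_TAG = "\x3e"


def strip_markup(text):
    """Copy text one character at a time, skipping tags inclusive."""
    kept = []
    inside_tag = False
    for ch in text:
        if ch == OPEN_TAG:
            inside_tag = True       # start skipping
        elif ch == CLOSE_TAG:
            inside_tag = False      # skip the closing bracket itself
        elif not inside_tag:
            kept.append(ch)
    return "".join(kept)


def normalise(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace.

    Assumed post-processing step, inferred from the sample output.
    """
    spaced = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return " ".join(spaced.split())


def clean_trial(text):
    """Strip the markup, then normalise what is left."""
    return normalise(strip_markup(text))
```

Because the copy knows nothing about XML, it treats every angle bracket as markup, which is fine for these files but would mangle text that uses the characters for anything else.&lt;br /&gt;&lt;br /&gt;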
The Python code which does this job is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-clean-copy-trials.py.html"&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1382692167822775089?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1382692167822775089'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1382692167822775089'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-3.html' title='A Naive Bayesian in the Old Bailey, Part 3'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5341592061435912948</id><published>2008-06-05T08:30:00.012-04:00</published><updated>2008-06-05T09:55:08.075-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 2</title><content type='html'>After downloading the XML-tagged files for the nineteenth century to our local machine, we ended up with a directory tree that looks like this:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Tagged_final (944 files, 9 folders)&lt;br /&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;Tagged_1830s_Files (62 files)&lt;/li&gt;&lt;ul&gt;&lt;li&gt;T18341124NW_SUP_DONE.xml&lt;/li&gt;&lt;li&gt;T18341205PH.xml&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;li&gt;T18391216CLR.xml&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;li&gt;Tagged_1840s_Files (120 files)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;li&gt;Tagged_1910s_Files (41 files)&lt;br /&gt;&lt;/li&gt;&lt;ul&gt;&lt;li&gt;T19100111GS_SUP_DONE.xml&lt;/li&gt;&lt;li&gt;T19100208GS.xml&lt;/li&gt;&lt;li&gt;...&lt;/li&gt;&lt;li&gt;T19130401CLR.xml&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/ul&gt;&lt;/ul&gt;Each XML file contains all of the trials that were conducted in a particular session.   The file 'T18341124NW_SUP_DONE.xml', for example, is the record for 24 Nov 1834.  I'm assuming that the string that follows the date in the filename ('NW_SUP_DONE') refers to the encoding process, so I'm going to ignore it.&lt;br /&gt;&lt;br /&gt;The next step is to split each of these XML files into individual trials.  Our overall strategy will be as follows.  First we want to create a directory for the trial files if one doesn't already exist.  
Then we will get a list of all of the XML files for the decade and step through them one at a time.  For each XML file, we're going to extract each trial and save it as a separate file.  Since a given trial is delimited with tags that look like &lt;span style="font-style:italic;"&gt;&amp;lt;trial id="t-18341124-1" n="1"&amp;gt; ... &amp;lt;/trial&amp;gt;&lt;/span&gt;, we can parse it out and save it separately as 't-18341124-1.txt'.  You can read this &lt;a href="http://www.oldbaileyonline.org/browse.jsp?div=t18341124-1"&gt;trial&lt;/a&gt; online at the Old Bailey archives.  You can also have a look at the &lt;a href="http://www.oldbaileyonline.org/browse.jsp?path=sessionsPapers/18341124.xml&amp;div=t18341124-1&amp;xml=yes"&gt;XML file&lt;/a&gt; to see what we're dealing with.  The fact that the OB team provides XML makes this archive an &lt;span style="font-style:italic;"&gt;awesome&lt;/span&gt; resource for digital historians, and other online sites should do the same forthwith.&lt;br /&gt;&lt;br /&gt;There are a variety of ways to parse XML, but it is quick and easy to use the &lt;a href="http://www.crummy.com/software/BeautifulSoup/"&gt;Beautiful Soup&lt;/a&gt; library for Python.  The program that splits the XML files into separate trial files is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ob-split-into-trials.py.html"&gt;here&lt;/a&gt;; for more information about using Beautiful Soup see &lt;span style="font-style:italic;"&gt;&lt;a href="http://niche.uwo.ca/programming-historian/"&gt;The Programming Historian&lt;/a&gt;&lt;/span&gt;.  There are far more trial files than session files: there were 12,959 trials in the 1830s alone.  
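&lt;br /&gt;&lt;br /&gt;For readers who want to see the shape of the splitting step, here is a minimal sketch.  It deliberately swaps in the standard library's xml.etree.ElementTree in place of Beautiful Soup, so it is not the linked program.  It assumes each session file is well-formed XML, and the function name split_session and its arguments are hypothetical.

```python
import os
import xml.etree.ElementTree as ET


def split_session(xml_text, out_dir):
    """Save each trial element of one session file as its own file.

    Hypothetical helper: files are named after the trial's id
    attribute, e.g. 't-18341124-1.txt'. Assumes well-formed XML.
    """
    # Create the directory for trial files if it doesn't already exist.
    os.makedirs(out_dir, exist_ok=True)
    root = ET.fromstring(xml_text)
    written = []
    for trial in root.iter("trial"):
        trial_id = trial.get("id")
        if trial_id is None:
            continue  # nothing to name the file after
        # Drop any text that follows the closing tag so that only the
        # trial element itself is serialised.
        trial.tail = None
        path = os.path.join(out_dir, trial_id + ".txt")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(ET.tostring(trial, encoding="unicode"))
        written.append(path)
    return written
```

Real archive files are not always perfectly well-formed, which is one reason a forgiving parser such as Beautiful Soup can be the more practical choice.&lt;br /&gt;&lt;br /&gt;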
Now that we have one file for each trial, we're ready for the next step.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5341592061435912948?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5341592061435912948'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5341592061435912948'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/06/naive-bayesian-in-old-bailey-part-2.html' title='A Naive Bayesian in the Old Bailey, Part 2'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8549888686799597402</id><published>2008-05-24T10:27:00.010-04:00</published><updated>2008-05-24T11:52:57.305-04:00</updated><title type='text'>A Naive Bayesian in the Old Bailey, Part 1</title><content type='html'>One of the great benefits of having a blog has been that people who are interested in digital history find me and let me know what they are doing in the field.  For a couple of years now, I've enjoyed an intermittent but invariably thought-provoking correspondence with &lt;a href="http://perseus.herts.ac.uk/uhinfo/schools/humanities/hist/history-staff/tim-hitchcock.cfm"&gt;Tim Hitchcock&lt;/a&gt;, one of the creators of the wonderful digital archive of the &lt;a href="http://www.oldbaileyonline.org"&gt;Old Bailey&lt;/a&gt; proceedings.  The OB team has recently added records for the period from 1834 to 1913, resulting in a total of almost 200,000 trial records, all tagged with XML.  When Tim offered me access to the XML files for a data mining project a few months ago, I jumped at the chance.  This is still very much work in progress, but I've decided to blog about the process for others who are interested in doing similar things, whether with the Old Bailey archive or some other.&lt;br /&gt;&lt;br /&gt;I started by downloading local copies of all of the files.  This is usually a good idea both because it makes the processing faster and because you aren't hammering the archive's servers every time you need to access a record.  There are a number of different ways to do something like this, and it is very handy for historians to be familiar with at least some of them.  
One possibility is to use a Firefox extension like &lt;a href="http://www.downthemall.net/"&gt;DownThemAll&lt;/a&gt;.  This allows you to download all of the links or images in a webpage.  It also allows you to pause and resume the download process, which can be useful when you're working with a large number of files.  For those who are more comfortable with scripting and prefer command line tools, it is hard to beat &lt;a href="http://www.gnu.org/software/wget/"&gt;GNU Wget&lt;/a&gt;.  Both programs are free.  The third alternative is to write your own script in a language like Python or Perl.  This option is most difficult, but gives you more control over various kinds of preprocessing, like dealing with accented characters.  (For more, see the section on this in &lt;span style="font-style:italic;"&gt;&lt;a href="http://niche.uwo.ca/programming-historian/index.php/Harvesting_links_and_downloading_pages"&gt;The Programming Historian&lt;/a&gt;&lt;/span&gt;.)  It takes a while to download a large batch of files, but once you have them you're ready to move on to the next step.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/archive" rel="tag"&gt;archive&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/feature+space" rel="tag"&gt;feature space&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/text+mining" rel="tag"&gt;text mining&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8549888686799597402?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/19974449/posts/default/8549888686799597402'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8549888686799597402'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/05/naive-bayesian-in-old-bailey-part-1.html' title='A Naive Bayesian in the Old Bailey, Part 1'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4273935016697782040</id><published>2008-05-17T09:33:00.002-04:00</published><updated>2008-05-17T10:13:18.327-04:00</updated><title type='text'>Geo-DJ, Part 3: The Simplest Working Version</title><content type='html'>Some people may have the ability to come up with something awesome on their first pass--say, Athena springing from the forehead of Zeus fully formed--but I've learned that I have to make some mistakes along the way.  So I try to come up with the simplest working version of a project, then complexify it gradually.  Of course, things being what they are, you can usually improve something by simplifying it, so the first, apparently simplest, version is actually &lt;a href="http://digitalhistoryhacks.blogspot.com/2008/05/beginning-in-middle.html"&gt;somewhere in the middle&lt;/a&gt; of the scale from perfect to perfectly foobar.&lt;br /&gt;&lt;br /&gt;With the geo-DJ, I imagine the simplest working version to be something like a metal detector for historical landscape features.  Suppose you know that there used to be an electric streetcar running through the middle of downtown, but most material traces of it have since been torn up.  
If you have a map of the streetcar route, you can use existing landmarks to georeference it, and determine the latitude and longitude of the endpoints (and any additional inflection points, but let's ignore those and work with a purely linear feature).  The locations of the endpoints need to be stored in memory.&lt;br /&gt;&lt;br /&gt;As the user walks around, the geo-DJ loops through the following algorithm.  First, determine the user's current position.  Then determine the line through the endpoints (the former rail), compute the length of the perpendicular line from the user to that line (i.e., the magnitude of a normal vector), and scale the pitch of a tone that is playing in the headphones inversely with that distance.  Repeat, ad infinitum... or until the batteries drain, whichever comes first.  If the user steps toward the rail, the pitch of the sound increases.  If he or she steps away from it, the pitch decreases.  Using this version of the system, a person can explore the lineaments of landscape features which may no longer exist.  
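&lt;br /&gt;&lt;br /&gt;The loop above can be sketched in a few lines of Python.  This is only a rough illustration, not the code that will run on the device: the flat-plane projection, the function names, the 200 metre range and the 220-880 Hz tone range are all assumptions of mine, and reading the GPS and driving the headphones are left out entirely.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def to_local_xy(lat, lon, origin_lat, origin_lon):
    """Project a latitude/longitude pair onto a flat plane centred on an
    origin, using an equirectangular approximation (fine over a few km)."""
    x = math.radians(lon - origin_lon) * math.cos(math.radians(origin_lat)) * EARTH_RADIUS_M
    y = math.radians(lat - origin_lat) * EARTH_RADIUS_M
    return x, y


def distance_to_line(point, end_a, end_b):
    """Perpendicular distance in metres from point to the infinite line
    through end_a and end_b; each argument is a (lat, lon) tuple."""
    origin_lat, origin_lon = end_a  # end_a becomes (0, 0) on the plane
    px, py = to_local_xy(point[0], point[1], origin_lat, origin_lon)
    bx, by = to_local_xy(end_b[0], end_b[1], origin_lat, origin_lon)
    length = math.hypot(bx, by)
    if length == 0.0:
        # Degenerate case: both endpoints coincide, so measure to that point.
        return math.hypot(px, py)
    # |cross product| / |direction| is the magnitude of the normal vector.
    return abs(bx * py - by * px) / length


def pitch_for_distance(distance_m, max_distance_m=200.0, low_hz=220.0, high_hz=880.0):
    """Map distance to a tone frequency: high_hz on the rail, falling
    linearly to low_hz at max_distance_m and beyond (hypothetical values)."""
    closeness = 1.0 - min(distance_m, max_distance_m) / max_distance_m
    return low_hz + closeness * (high_hz - low_hz)
```

On each pass through the loop, the device would call something like pitch_for_distance(distance_to_line(current_position, end_a, end_b)) and retune the tone.  Whether a small microcontroller can afford the trigonometry in real time is a separate question.&lt;br /&gt;&lt;br /&gt;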
See &lt;a href="http://mike.teczno.com/notes/scar-tissue.html"&gt;Michal Migurski's great air photo&lt;/a&gt; of San Francisco "healing" around a former railroad.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/electronica" rel="tag"&gt;electronica&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historical+consciousness" rel="tag"&gt;historical consciousness&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place" rel="tag"&gt;place&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place+based+computing" rel="tag"&gt;place-based computing&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4273935016697782040?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4273935016697782040'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4273935016697782040'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/05/geo-dj-part-3-simplest-working-version.html' title='Geo-DJ, Part 3: The Simplest Working Version'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3284534639175065867</id><published>2008-05-16T10:07:00.009-04:00</published><updated>2008-05-16T11:09:19.516-04:00</updated><title type='text'>Geo-DJ, Part 2: Storage vs. Computation</title><content type='html'>In my last post, I mentioned that I'm working with a couple of talented students this summer on digital history projects, and talked a bit about Adam Crymble's Zotero translators.  The other person who is working with me is &lt;a href="http://devonelliott.blogspot.com/"&gt;Devon Elliott&lt;/a&gt;.  Last year Devon came up with a plan to &lt;a href="http://devonelliott.blogspot.com/2007/10/wikiarchives.html"&gt;use wikis in archives&lt;/a&gt; and built a model of Sputnik that contained a microcontroller, a thermistor to sense temperature changes and an accelerometer to respond to motion.  The information about the model's state was conveyed by modulating the frequency and duration of a beeping signal.  Devon did the programming and electronics without any help from me, so I knew he would be the perfect collaborator for the geo-DJ project.&lt;br /&gt;&lt;br /&gt;The &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/12/geo-dj-part-1-idea.html"&gt;geo-DJ&lt;/a&gt; is a wearable iPod-like device.  As you wander around a present-day environment, it uses GPS to determine your position and synthesizes an electronic soundtrack that reflects former land-use patterns.  Creating something like this wouldn't be too difficult using a lightweight laptop or a powerful handheld computer running GIS software.  
But we're interested in doing the project at as low a level as possible, preferably using an open source microcontroller board like &lt;a href="http://www.arduino.cc/"&gt;Arduino&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;In the history of computing, people often faced the limits of both memory capacity and processing speeds.  Consider the problem of determining trigonometric functions for particular values.  There are algorithms for computing the &lt;a href="http://mathworld.wolfram.com/Sine.html"&gt;sine&lt;/a&gt; of an angle, but they're complicated.  Before the widespread adoption of digital calculators it was common for people to use trig tables, a clear case of using more storage space to simplify or speed up calculation.  With digital calculators or general-purpose computers, it is simpler and faster to punch in the calculation than to look it up in a trig table.  But here is the tricky part: it may not be simpler for the computer to do the computation.  The software may involve looking up the value of various trig functions in tables, even though that is not apparent to the user.&lt;br /&gt;&lt;br /&gt;Doing the geo-DJ project on a small computer like Arduino approaches these limits in (at least) two places: GIS and music synthesis.  In the case of the GIS, we want to know the person's distance from the various points, lines and polygons that are used to represent historical features of interest.  There are &lt;a href="http://www.geog.ubc.ca/courses/klink/gis.notes/ncgia/toc.html"&gt;algorithms&lt;/a&gt; for computing these measures, but our processor is slow and our application requires real-time feedback.  It might make more sense to pre-compute the measures and store the information about distances in a multi-dimensional array.  Of course, the basic amount of memory on an Arduino is also very limited, so we have to find the optimal balance.  In the case of music synthesis, a similar problem arises.  
Sounds have complicated waveforms which can be computed or looked up in a wave table.  Once again, we will have to find the right balance between storage and computation.&lt;br /&gt;&lt;br /&gt;It may be that the platform that we're trying to use is too simple.  We may have to add more memory, or dedicated signal processing hardware, or both.  But that is one of the things that makes a project like this fun.  By working close to computational limits we not only have more of a challenge, but more of a sense of what computing used to be like, long ago, when we were kids.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/electronica" rel="tag"&gt;electronica&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historical+consciousness" rel="tag"&gt;historical consciousness&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place" rel="tag"&gt;place&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place+based+computing" rel="tag"&gt;place-based computing&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3284534639175065867?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3284534639175065867'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3284534639175065867'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/05/geo-dj-part-2-storage-vs-computation.html' title='Geo-DJ, Part 2: Storage vs. 
Computation'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6502128848336780934</id><published>2008-05-10T10:29:00.008-04:00</published><updated>2008-05-10T11:25:13.769-04:00</updated><title type='text'>Beginning in the Middle</title><content type='html'>For the past few summers, I've been taking on talented students to work on digital stuff.  Rather than giving them a canned project or expecting anything in particular to happen, I usually give them a difficult problem and then step back.  The results have been very encouraging, especially since I tend to choose independent students who are OK with my laissez faire approach.&lt;br /&gt;&lt;br /&gt;One of the people who is working with me this summer is &lt;a href="http://adamcrymble.blogspot.com/"&gt;Adam Crymble&lt;/a&gt;.  Last year he managed to come up with &lt;a href="http://adamcrymble.blogspot.com/2007/11/how-to-get-feedback-from-your-exhibit.html"&gt;a low-tech public history hack&lt;/a&gt;, make some &lt;a href="http://adamcrymble.blogspot.com/2008/04/stonehenge-videos.html"&gt;3D animations&lt;/a&gt;, and teach himself enough HTML and CSS to hand code a &lt;a href="http://publish.uwo.ca/~acrymble/Chapter_1.html"&gt;web page&lt;/a&gt;.  So for a summer project I suggested he try and write some translators for &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt;.  He doesn't have any training for this, and I am of limited assistance since I don't really know JavaScript.  Sink or swim, buddy!&lt;br /&gt;&lt;br /&gt;Adam intuitively started where I would.  
He printed out all the code and documentation that he could get his hands on, then started using colored highlighters to focus his attention on the parts that he could understand.  He also used Wikipedia, the W3 Schools, and our library's Safari subscription to O'Reilly books online.  In the space of a couple of weeks, he's made great progress and learned enough so that I'm still of no use to him.&lt;br /&gt;&lt;br /&gt;Reading other people's code is always hard, but it is one of the best ways to learn how to program.  As Abelson and Sussman write in &lt;span style="font-style:italic;"&gt;Structure and Interpretation of Computer Programs&lt;/span&gt;, "a computer language is not just a way of getting a computer to perform operations but rather ... a novel formal medium for expressing ideas about methodology.  Thus, programs must be written for people to read, and only incidentally for machines to execute."  The beginning programmer starts out much like a child who is acquiring a natural language: immersed in a medium produced by people who are already fluent.&lt;br /&gt;&lt;br /&gt;Historians have a secret advantage when it comes to learning technical material like programming: we are already used to doing close readings of documents that are confusing, ambiguous, incomplete or inconsistent.  We all sit down to our primary sources with the sense that we &lt;span style="font-style:italic;"&gt;will&lt;/span&gt; understand them, even if we're going to be confused for a while.  This approach allows us to eventually produce learned books about subjects far from our own experience or training.&lt;br /&gt;&lt;br /&gt;I believe in eating my own dogfood, and wouldn't subject my students to anything I wouldn't take on myself.  As my own research and teaching moves more toward desktop fabrication, I've been reading a lot about materials science, structural engineering, machining, CNC and other subjects for which I have absolutely no preparation.  
It's pretty confusing, of course, but each day it all seems a little more clear.  I've also been making a lot of mistakes as I try to make things.  As humanists, I don't think we can do better than to follow Terence's &lt;a href="http://www.bartleby.com/66/81/57681.html"&gt;adage&lt;/a&gt; that nothing human should be alien to us.  It is possible to learn anything, if you're willing to begin in the middle.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historiography" rel="tag"&gt;historiography&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interdisciplinarity" rel="tag"&gt;interdisciplinarity&lt;/a&gt; | &lt;a href="http://technorati.com/tag/learning" rel="tag"&gt;learning&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-6502128848336780934?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6502128848336780934'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6502128848336780934'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/05/beginning-in-middle.html' title='Beginning in the Middle'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6649615285390313010</id><published>2008-05-04T09:03:00.004-04:00</published><updated>2008-05-04T09:27:16.272-04:00</updated><title type='text'>The Programming Historian is Now Available</title><content type='html'>&lt;span style="font-style:italic;"&gt;&lt;a href="http://niche.uwo.ca/programming-historian/"&gt;The Programming Historian&lt;/a&gt;&lt;/span&gt; is now available on the NiCHE: Network in Canadian History &amp;amp; Environment website.  This work is an open-access introduction to programming in Python, aimed at working historians (and other humanists) with little previous experience.  Introductory lessons teach you how to&lt;br /&gt;&lt;ul&gt;&lt;li&gt;install &lt;a href="http://zotero.org"&gt;Zotero&lt;/a&gt;, the &lt;a href="http://python.org"&gt;Python&lt;/a&gt; programming language and other useful tools&lt;/li&gt;&lt;li&gt;read and write data files&lt;/li&gt;&lt;li&gt;save web pages and automatically extract information from them&lt;/li&gt;&lt;li&gt;count word frequencies&lt;/li&gt;&lt;li&gt;remove stop words&lt;/li&gt;&lt;li&gt;automatically refine searches&lt;/li&gt;&lt;li&gt;make n-gram dictionaries&lt;/li&gt;&lt;li&gt;create keyword-in-context (KWIC) displays&lt;/li&gt;&lt;li&gt;make tag clouds, and&lt;/li&gt;&lt;li&gt;harvest sets of hyperlinks&lt;/li&gt;&lt;/ul&gt;&lt;span style="font-style:italic;"&gt;The Programming Historian&lt;/span&gt; is a work-in-progress.  We are constantly adding new material, much of it driven by reader request.  
Upcoming topics will include indexing, scraping projects, simple spiders, mashups and much more.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/browser" rel="tag"&gt;browser&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+access" rel="tag"&gt;open access&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+source" rel="tag"&gt;open source&lt;/a&gt; | &lt;a href="http://technorati.com/tag/programming" rel="tag"&gt;programming&lt;/a&gt; | &lt;a href="http://technorati.com/tag/Python" rel="tag"&gt;Python&lt;/a&gt; | &lt;a href="http://technorati.com/tag/zotero" rel="tag"&gt;Zotero&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-6649615285390313010?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6649615285390313010'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6649615285390313010'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/05/programming-historian-is-now-available.html' title='The Programming Historian is Now Available'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-781503639973614623</id><published>2008-04-10T10:32:00.025-04:00</published><updated>2008-04-10T12:32:25.576-04:00</updated><title type='text'>Fitness Functions</title><content type='html'>[Cross-posted to Cliopatria &amp;amp; Digital History Hacks]&lt;br /&gt;&lt;br /&gt;One of the distinctions that applied mathematicians make is between linear and nonlinear problems.  In a linear problem, you have a set of variables that you can tweak, and as you adjust each variable you can get ever closer to an optimal configuration.  Using techniques such as &lt;a href="http://en.wikipedia.org/wiki/Linear_programming"&gt;linear programming&lt;/a&gt;, it is straightforward to determine precisely how many scoops of raisins to put in your box of bran, or how many Cherries will make a Garcia.  Many problems, alas, don't admit of this kind of solution.  In the days before digital everything, it was all too common to futz around with the brightness knob, color balance, rabbit ears, and position of pets and small children to try and get a TV signal that didn't look like it was being relayed from the dark side of the moon.  The slightest change could make things drastically better or worse, with no apparent logic.&lt;br /&gt;&lt;br /&gt;The problem with nonlinear problems is that you pretty much have to get every variable right at the same time.  Think of the space of all possible states of your problem as a kind of dark landscape, and the optimal solution as the highest point in that space.  Linear problems have smooth landscapes.  If you start groping your way up a hill, you end up at the top and that's the best you can do overall.  
Nonlinear problems have jagged landscapes.  It is easy to feel your way up a low peak and get stuck there, unaware of higher peaks elsewhere.&lt;br /&gt;&lt;br /&gt;There are different methods for solving nonlinear optimization problems; one of the more popular makes use of &lt;a href="http://www.aaai.org/AITopics/pmwiki/pmwiki.php/AITopics/GeneticAlgorithms"&gt;genetic algorithms&lt;/a&gt;.  First you find a way of representing all of the possible solutions to your problem.  In the TV example, you might want to represent the angle of each of the two antennas, the &lt;span style="font-style:italic;"&gt;xy&lt;/span&gt; coordinates of the napping cat, the rotational angle of the brightness knob, and so on.  A list of each of these variables is known as a &lt;span style="font-style:italic;"&gt;genome&lt;/span&gt;, and a list of particular values as a &lt;span style="font-style:italic;"&gt;genotype&lt;/span&gt;.  Generate a small random population of genotypes, and test each one to see how good it is.  This test is called the &lt;span style="font-style:italic;"&gt;fitness function&lt;/span&gt;.  In our example, it is the person sitting on the couch shouting "not bad," "pretty good" or "awful" each time an adjustment is made.  Once you know how well each of your solutions performed, you make a new generation of solutions by mutating and recombining the genomes of your old ones.  Over time, the fitness of the population increases, and the artificial selection mechanism eventually finds solutions that are near optimal. 
(If you want to start programming your own GAs, I recommend Mitchell's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Introduction-Genetic-Algorithms-Complex-Adaptive/dp/0262631857/"&gt;Introduction&lt;/a&gt;&lt;/span&gt; and Goldberg's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Genetic-Algorithms-Optimization-Machine-Learning/dp/0201157675/"&gt;Genetic Algorithms&lt;/a&gt;&lt;/span&gt; as good places to start).&lt;br /&gt;&lt;br /&gt;One of the perennial tragedies of academia is that we constantly pretend that our careers or those of our students are linear optimization problems.  Grades are the most obvious way that we do this.  Students learn that their mark on one test is independent of their mark on another, that it is better to have a high GPA than to risk taking hard courses that interest them, that exploration and failure will usually be punished.  Teachers justify marks by appealing to rubrics, bemoaning grade inflation and students "who look good on paper."  Too many of us think of a good career in terms of lines on a CV, a list of so many independent accomplishments, each of which can be attained and then forgotten.&lt;br /&gt;&lt;br /&gt;On a rainy day in 1992, I wandered into a Vancouver technical bookstore on my way home from school.  I think I was probably avoiding a problem set or some other homework, as I've never been very good at doing what I should be doing rather than what I want to be doing.  Anyway, I remember finding a copy of John Holland's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Adaptation-Natural-Artificial-Systems-Introductory/dp/0262581116/"&gt;Adaptation in Natural and Artificial Systems&lt;/a&gt;&lt;/span&gt; on the shelf of new releases and really wanting to buy it.  I stood in the store holding the book for the longest time.  
It was more than I could afford, it was a distraction from my schoolwork, and I had a bad habit of buying books and losing interest in them.  I had been doing a lot of exploring and a fair bit of failing.  I finally made the decision that was, in context at least, sub-optimal.  I bought the book and went home to read it rather than doing my schoolwork.&lt;br /&gt;&lt;br /&gt;I often tell my students that they should follow their curiosity, take chances and not be afraid to fail.  You never really know what whim, what chance encounter or distraction is going to change your life.  In my case, I read a lot of science fiction and graphic novels and ate a lot of guacamole.  I played role playing games and got married early and happily.  I watched TV.  I got bad grades in linear algebra and analysis, but I liked math enough to keep trying until I got better at it.  And my first published work was on a subject that was novel and trendy enough that my reputation as an up-and-coming researcher outweighed my uneven transcript: genetic algorithms.  It's tempting to look back at that moment in the bookstore as a crucial inflection point in my life, but that would be too linear.  
The choices that we make affect our fitness, but never in a way that makes it easy to assign credit or blame.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/feedback" rel="tag"&gt;feedback&lt;/a&gt; | &lt;a href="http://technorati.com/tag/genetic+algorithms" rel="tag"&gt;genetic algorithms&lt;/a&gt; | &lt;a href="http://technorati.com/tag/nonlinear+optimization" rel="tag"&gt;nonlinear optimization&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-781503639973614623?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/781503639973614623'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/781503639973614623'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/04/fitness-functions.html' title='Fitness Functions'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1679870318115251054</id><published>2008-04-05T08:25:00.047-04:00</published><updated>2008-12-29T19:34:46.352-05:00</updated><title type='text'>Visualizing the Emergence of a Strategic Knowledge Cluster</title><content type='html'>In the summer of 2004, when I had just arrived at the University of Western Ontario, my new colleague &lt;a href="http://history.uwo.ca/faculty/maceachern/"&gt;Alan MacEachern&lt;/a&gt; invited me to join a small group that was putting together a grant application. 
 The federal agency &lt;a href="http://www.sshrc.ca/"&gt;SSHRC&lt;/a&gt; had just announced funding for the design of something called 'research clusters'.  At the time none of us was particularly clear what these clusters were supposed to be, and like many of the best kinds of opportunity, I don't think that SSHRC was really clear either.  We eventually settled on the idea that the main task of clusters was 'knowledge mobilization', which left the matter nicely open.&lt;br /&gt;&lt;br /&gt;Our initial grant application was successful, and five of us set to work to develop &lt;a href="http://niche.uwo.ca"&gt;NiCHE&lt;/a&gt;, the Network in Canadian History &amp;amp; Environment / Nouvelle initiative canadienne en histoire de l'environnement.  As we tried various things we kept track of activities and participants, allowing us to visualize the emergence of our research network.  I should say up front that NiCHE doesn't &lt;span style="font-style:italic;"&gt;cause&lt;/span&gt; research and is prohibited from directly funding research per se.  Instead we find ways to facilitate research and training in environmental history broadly construed, and to mobilize the knowledge that researchers create.&lt;br /&gt;&lt;br /&gt;One of the tools that we use for visualization is an open source package called &lt;a href="http://www.graphviz.org/"&gt;Graphviz&lt;/a&gt;.  We create a file that specifies entities (people, publications, field trips, etc.) and the relationships between them, then we hand off that file to Graphviz, which uses sophisticated algorithms to figure out a neat way to plot the network.  We've found such visualization to be very useful, even though it can only ever show the tip of a much larger social iceberg.  In our graphs, two people may be linked because they attended the same meeting or each published a chapter in a book.  
Our data doesn't show whether they knew each other in grad school, have a longstanding rivalry, or both secretly like &lt;span style="font-style:italic;"&gt;Buffy the Vampire Slayer&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The original NiCHE executive group worked quite closely together.  One of the interesting facts about networks is that the number of possible pairwise relations between entities grows much faster than the number of entities as the network gets larger.  Two people have at most one relationship, three people can have three (AB, BC, AC), four people can have six (AB, AC, AD, BC, BD, CD).  The ten possible pairwise relationships between the five of us looked like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlrYV1P6hI/AAAAAAAAALQ/DRnosj5gOEU/s1600-h/generalized-network-growth-200408.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 150px; height: 152px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlrYV1P6hI/AAAAAAAAALQ/DRnosj5gOEU/s200/generalized-network-growth-200408.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285373703617505810" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;One of the first things that we tried to do was provide licenses for Groove collaborative software to all of the people who were interested in joining NiCHE.  For people with Windows machines the software worked very well.  Unfortunately, it never really worked for people with Macs.  We had to supplement Groove with other software, find suboptimal workarounds, and eventually abandon it.  For a while, however, it gave us a way to interact relatively closely with NiCHE members who also happened to be tech-savvy Windows users.  
Our network took on a hub-and-spoke form.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlre01MCnI/AAAAAAAAALY/Us5SEUyrD34/s1600-h/generalized-network-growth-200411.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 199px; height: 200px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlre01MCnI/AAAAAAAAALY/Us5SEUyrD34/s200/generalized-network-growth-200411.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285373815017966194" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To reach out to more potential participants, we formed an advisory group and held a meeting in Toronto.  Instead of one hub, we now had two, with some bridging members who participated in both online and face-to-face activities.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlrlzpfUuI/AAAAAAAAALg/Yc-PN-3bepY/s1600-h/generalized-network-growth-200412.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 178px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlrlzpfUuI/AAAAAAAAALg/Yc-PN-3bepY/s200/generalized-network-growth-200412.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285373934959547106" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The executive group split up to host regional meetings in other cities across Canada.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlrtUaK46I/AAAAAAAAALo/27wCJaT4RCs/s1600-h/generalized-network-growth-200501.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 162px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlrtUaK46I/AAAAAAAAALo/27wCJaT4RCs/s200/generalized-network-growth-200501.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285374064012747682" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We put together an online directory so members could add information about 
themselves.  The directory allowed us to contact people and tell them about upcoming activities.  Since it was publicly accessible, the directory also allowed NiCHE members to learn more about one another.  &lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlsliiS99I/AAAAAAAAAMI/hGa0yLf75r0/s1600-h/generalized-network-growth-200503.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 172px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlsliiS99I/AAAAAAAAAMI/hGa0yLf75r0/s200/generalized-network-growth-200503.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285375029877602258" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Although adding one's name to a directory is a relatively weak form of participation,  we found that many people became more active in NiCHE over time.  The network seemed to extend to new participants, many of whom would then get involved in a number of subsequent projects.  There is a saying in free / open source software, "contribute nothing, expect nothing."  Conversely we could say that the people who contributed something to NiCHE could expect something from us.  Some of them contributed articles to a &lt;a href="http://niche.uwo.ca/node/157"&gt;special issue&lt;/a&gt; of the journal &lt;span style="font-style:italic;"&gt;Environmental History&lt;/span&gt;.  
Some contributed chapters to a new textbook, &lt;span style="font-style:italic;"&gt;&lt;a href="http://hed.nelson.com/nelsonhed/instructor.do?pagefrom=search&amp;disciplinenumber=21&amp;product_isbn=9780176441166"&gt;Method and Meaning in Canadian Environmental History&lt;/a&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlr2n_ZG0I/AAAAAAAAALw/6g5vHgR2I10/s1600-h/generalized-network-growth-200505.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 172px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlr2n_ZG0I/AAAAAAAAALw/6g5vHgR2I10/s200/generalized-network-growth-200505.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285374223887964994" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Subsequent activities like a summer school and a graduate student workshop brought in some new participants, and brought back many more: &lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlsDwygCbI/AAAAAAAAAL4/itVTAJBU7lM/s1600-h/generalized-network-growth-200509.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 172px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlsDwygCbI/AAAAAAAAAL4/itVTAJBU7lM/s200/generalized-network-growth-200509.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285374449588111794" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlsRLt7dBI/AAAAAAAAAMA/8KUE5cj0tUA/s1600-h/generalized-network-growth-200611.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 172px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlsRLt7dBI/AAAAAAAAAMA/8KUE5cj0tUA/s200/generalized-network-growth-200611.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285374680154993682" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;When SSHRC announced a much 
larger grant for strategic knowledge clusters, we were able to include a version of the last figure as part of our application.  (The Graphviz script that generated it is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/generalized-network-growth.dot.html"&gt;here&lt;/a&gt;.) &lt;br /&gt;&lt;br /&gt;A year and a half later, we're in the process of scaling up NiCHE activities by a couple of orders of magnitude.  Network visualization gives us some insight into the work of a few hundred people who are loosely affiliated with NiCHE and collaborating in many different ways.  We can identify people who have energy and initiative to share, and try to help them.  Some provide 'bonding capital', tying tightly-linked groups closer together.  Some provide 'bridging capital', mobilizing knowledge from one region or disciplinary specialization to another.  We can also be more strategic about developing the connections that still need to be made, to make our network stronger and more effective.  (For more about social networks, see Clay Shirky's new &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Here-Comes-Everybody-Organizing-Organizations/dp/1594201536/"&gt;Here Comes Everybody&lt;/a&gt;&lt;/span&gt;.)&lt;br /&gt;&lt;br /&gt;What is more exciting is that we are getting closer to the point where we can make these kinds of tools available to everyone in NiCHE.  People will be able to enter their own information about research collaborations and interests, and explore social connections within the network.  
It will become much easier to find joint acquaintances to make introductions or to find people with particular skills or expertise.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/graphviz" rel="tag"&gt;Graphviz&lt;/a&gt; | &lt;a href="http://technorati.com/tag/social+network+analysis" rel="tag"&gt;social network analysis&lt;/a&gt; | &lt;a href="http://technorati.com/tag/sshrc" rel="tag"&gt;SSHRC (Social Sciences and Humanities Research Council of Canada)&lt;/a&gt; | &lt;a href="http://technorati.com/tag/visualization" rel="tag"&gt;visualization&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1679870318115251054?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1679870318115251054'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1679870318115251054'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/04/visualizing-emergence-of-strategic.html' title='Visualizing the Emergence of a Strategic Knowledge Cluster'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlrYV1P6hI/AAAAAAAAALQ/DRnosj5gOEU/s72-c/generalized-network-growth-200408.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6808735410862777102</id><published>2008-03-25T09:56:00.002-04:00</published><updated>2008-03-25T10:45:19.950-04:00</updated><title type='text'>Monitoring the Backchannel</title><content type='html'>&lt;a href="http://www.robmacdougall.org/"&gt;Rob MacDougall&lt;/a&gt; and I are putting together something new and fun for &lt;a href="http://history.uwo.ca/"&gt;Western&lt;/a&gt; freshmen this coming fall, a course called "Science, Technology and Global History."  Our goals are modest.  We hope to cover the history of the whole enchilada from the Big Bang to the near future, while inculcating the idea that historians and scientists both need to have the same kind of critical, evidence-based habits of thought.  Forget the two cultures.  While Rob is figuring out how our students can work in teams online with students in South Asia, I'm left to kick back and brainstorm classroom mischief.&lt;br /&gt;&lt;br /&gt;One of the interesting things about first year courses at our university is that the enrollment can't be capped.  So we could have six students or six hundred.  I've done large lectures before, and I'm not very enthusiastic about the format.  I try to wave my hands a lot, because I once attended a seminar by a psychologist who studies the teaching evaluation process and he said that students rank mobile professors more highly than sessile ones.  
I also stopped talking every ten minutes or so to give students a chance to ask questions, but most of them seemed pretty shy.  Each term, I got to know the half-dozen who did like to speak up in class.&lt;br /&gt;&lt;br /&gt;Since I teach with a laptop and LCD projector, I've been thinking it would be fun to have a chat window running so students could provide backchannel commentary that could be seen by all.  This might be something like IM or &lt;a href="http://twitter.com"&gt;Twitter&lt;/a&gt;.  As I was talking, I could keep an eye on the chat window and field questions that would take the class somewhere interesting.  If there was a sudden storm of confusion, I could go back and unpack or repeat something.  Students who read my blog could even try to amuse me by setting loose &lt;a href="http://digitalhistoryhacks.blogspot.com/2008/03/lunchtime-chat.html"&gt;chatterbots&lt;/a&gt; that simulate famous historical figures.  Now I suspect that some of you might be worrying that a few students would abuse the system and type obscenities or whatever.      But I'm not worried, because I can always walk over to the computer and close the chat window.  It's that easy.  
I figure that if you treat people like adults they respond in kind.&lt;br /&gt;&lt;br /&gt;I'd be happy to hear from anyone who has tried something like this.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/feedback" rel="tag"&gt;feedback&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt; | &lt;a href="http://technorati.com/tag/twitter" rel="tag"&gt;Twitter&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-6808735410862777102?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6808735410862777102'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6808735410862777102'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/03/monitoring-backchannel.html' title='Monitoring the Backchannel'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5981763792737459401</id><published>2008-03-23T10:41:00.004-04:00</published><updated>2008-03-23T11:22:09.669-04:00</updated><title type='text'>A Lunchtime Chat</title><content type='html'>There is a question that I'm told is popular to ask incoming freshmen: "Which historical figure (Jesus, Gandhi, Ozzy, etc.) would you most like to have lunch with and why?"  Now I have no idea what quality in the student this question is supposed to elicit, except perhaps forbearance.  
I'm glad that no one ever tried it out on me, because most of the answers that occur to me--"Is that likely to happen if I decide to attend this school, sir?"--probably wouldn't help my case.  When the list of candidates is specified in advance, they're typically chosen either because they are (in)famous icons of recent pop culture or because they are timeless sages who have already provided written answers to the most common set of meaning-of-life-style questions.  As much as I might rather meet Lao Tzu than Elvis, my hunch is that it would be more in keeping with Taoist principles to dine with someone who speaks your language and shares your preference for Southern fried cooking.  I could be wrong about that.&lt;br /&gt;&lt;br /&gt;The whole dining with the stars thing puts me in mind of the Turing test.  Alan Turing famously argued that we'd know that a computer was intelligent when its conversational interaction was indistinguishable from a person's.  Because people and computers look different (&lt;a href="http://www.imdb.com/title/tt0083658/"&gt;android fantasies&lt;/a&gt; notwithstanding) he suggested a situation that would cloak the embodiment of the interlocutor.  The person who is conducting the test takes turns asking questions of two different respondents via a low-bandwidth connection (think IM).  If he or she can tell which one is the computer, it fails the Turing test.&lt;br /&gt;&lt;br /&gt;In 1966, Joseph Weizenbaum created a conversational program called Eliza.  Eliza could read an incoming statement like "I hate dogs" and use simple transformation rules to turn it into a question, "Why do you hate dogs?"  It could offer noncommittal responses like "Please go on."  If the person answered a question with "Yes," Eliza might say "You seem positive."  Many people interacted with Eliza enthusiastically, leading some to say the Turing test had already been passed and others to say that it was rubbish.  
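&lt;br /&gt;&lt;br /&gt;To make the idea concrete, here is a minimal Python sketch of the kind of statement-to-question rewriting described above.  It is only an illustration: the rule table, the name respond, and the choice of regular expressions are my own assumptions, not a reconstruction of Weizenbaum's script.&lt;br /&gt;&lt;br /&gt;
&lt;pre&gt;
```python
import re

# Hypothetical rule table: (pattern, reply template) pairs.
# These rules are illustrative assumptions, not Weizenbaum's originals.
RULES = [
    (re.compile(r"^i (hate|love|like|fear) (.+)$", re.IGNORECASE),
     "Why do you {0} {1}?"),
    (re.compile(r"^yes\b", re.IGNORECASE), "You seem positive."),
]

# Noncommittal fallback used when no rule matches.
DEFAULT_REPLY = "Please go on."


def respond(statement):
    """Return an Eliza-style reply to a single statement."""
    # Trim whitespace and trailing punctuation before matching.
    text = statement.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.match(text)
        if match:
            # Reuse the captured words to build the reply.
            return template.format(*match.groups())
    return DEFAULT_REPLY
```
&lt;/pre&gt;
With these rules, respond("I hate dogs") gives "Why do you hate dogs?", respond("Yes") gives "You seem positive.", and anything unrecognized falls back to "Please go on."  A handful of patterns goes a surprisingly long way, which is part of why the program fooled people.&lt;br /&gt;&lt;br /&gt;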
(If you'd like to converse with Eliza you can Google for one of her many incarnations.)&lt;br /&gt;&lt;br /&gt;If I were chatting with freshmen, say over lunch, I'd be looking for students who had heard of Eliza and the Turing test and had a well-developed sense of anachronism.  That hasn't happened to me yet.  As a public service, I'm going to offer a new question that has been updated for the digital humanities: "What challenges would you encounter when trying to create an Eliza-style simulation of each of the following historical figures?  Which would be most or least likely to pass a Turing test and why?"&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/rtfm" rel="tag"&gt;RTFM&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5981763792737459401?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5981763792737459401'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5981763792737459401'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/03/lunchtime-chat.html' title='A Lunchtime Chat'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1930528473393123201</id><published>2008-03-10T09:14:00.008-04:00</published><updated>2008-03-10T10:04:10.417-04:00</updated><title type='text'>Pupation</title><content type='html'>Every so often in the past few decades I've had to go through my accumulated collections of code and text and &lt;a href="http://en.wikipedia.org/wiki/Binary_file"&gt;binaries&lt;/a&gt; and try to translate them so that they could be used on a new platform or new version of an operating system.  In some cases, such as text files, it's always been quite easy.  In others, it has been more difficult, or even impossible.  The assembly language that I wrote for one chip, for example, won't run on any other.  The KnowledgeMan database programming that I did in the 1980s dates me, but otherwise isn't of much use now.  More poignantly, KMan doesn't even have its own page in Wikipedia.  Now I'm in the process of moving all of my files to an open source revision-control system (more on that in a later post) and face many familiar problems.  Once again, I'm discovering that open formats are a &lt;span style="font-style:italic;"&gt;really good idea&lt;/span&gt;, and that in thirty years--if I last that long--the only sources that I will have to look back on my work right now may be text, XML and source code.&lt;br /&gt;&lt;br /&gt;As I go through my files this time around, however, there are a lot of notes from writing my dissertation and publishing it.  I'm reminded that I've created a few new careers by metabolizing a succession of older ones and metamorphosing into something different.  
And when I look through my archival notes and book notes and lists of ideas and questions, I see that most of my work didn't end up in the &lt;a href="http://www.amazon.com/Archive-Place-Unearthing-Chilcotin-Plateau/dp/0774813776/"&gt;published book&lt;/a&gt;.  Some of it was tangential, some was forgotten, some better forgotten.&lt;br /&gt;&lt;br /&gt;I'm thinking a lot about the computational tools that historians might use to write different kinds of history.  In methodological guides, the emphasis is always on keeping track of things, on proper notetaking and proper citation, so that you don't forget where something came from.  Working with digitized sources makes it much easier to search and cite and archive, and easier to imagine that almost everything can be saved.  But what if some projects are crucially dependent on a period of forgetting and reuse?  What kind of tool would allow some sources to be lost, remake your tangents into something new, turn your caterpillar into a butterfly or a moth?&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historical+consciousness" rel="tag"&gt;historical consciousness&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historiography" rel="tag"&gt;historiography&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+formats" rel="tag"&gt;open formats&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1930528473393123201?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1930528473393123201'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1930528473393123201'/><link rel='alternate' type='text/html' 
href='http://digitalhistoryhacks.blogspot.com/2008/03/pupation.html' title='Pupation'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-306819096206748799</id><published>2008-03-06T10:36:00.006-05:00</published><updated>2008-03-06T10:59:30.621-05:00</updated><title type='text'>A Cure for Continuous Partial Attention</title><content type='html'>On my way home the other night I noticed that the lead story in one of the university student newspapers was headlined "Frustrated profs consider laptop ban."  This is one of those perennial favorites.  Students seem distracted?  Cut off their wireless, ban laptops and smart phones, and forbid internet use for coursework.  After all, everyone knows that students always paid respectful attention to their teachers before computer and wireless internet use became widespread.  The part of the article that made me laugh the hardest was a quote from an anonymous professor who complained that one student was typing into a laptop furiously for no reason.  How hard must that class suck, if the &lt;span style="font-style:italic;"&gt;prof&lt;/span&gt; thinks that nothing noteworthy was going on?  And wouldn't you feel stupid if your inattentive student was brainstorming a cure for cancer?  For their part, the students interviewed for the story mostly seemed to think that laptop use was actually helping them to learn and to prepare for their futures.&lt;br /&gt;&lt;br /&gt;Really, shouldn't we be worried about the digital divide, rather than trying to exacerbate it?  
As Manuel Castells argues in &lt;span style="font-style:italic;"&gt;The Internet Galaxy&lt;/span&gt;, a lack of access to networked devices is only one part of the problem.  One of the fundamental challenges for a network society is&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;the installation of information-processing and knowledge-generation capacity in every one of us--and particularly in every child.  By this I obviously do not mean literacy in using the Internet in its evolving forms (this is presupposed).  I mean education.  But in its broader, fundamental sense; that is, to acquire the intellectual capacity of learning to learn throughout one's whole life, retrieving the information that is digitally stored, recombining it, and using it to produce knowledge for whatever purpose we want.  This simple statement calls into question the entire education system developed during the industrial era. (277-78)&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;A student's freedom to think their own thoughts, to structure their own mental activity, is a far greater good than trying to compel some semblance of attention.  So here's a suggestion for all you frustrated profs: relax.  I'm guessing that you may have spent some of your own undergraduate hours daydreaming, doodling or writing snarky notes in the margins of your notebooks.  
And look how well you turned out!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-306819096206748799?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/306819096206748799'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/306819096206748799'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/03/cure-for-continuous-partial-attention.html' title='A Cure for Continuous Partial Attention'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7791328243982782167</id><published>2008-02-17T15:54:00.004-05:00</published><updated>2008-02-17T16:24:30.004-05:00</updated><title type='text'>OSes/2</title><content type='html'>In my previous post I described a tangible interface that I made for Google Earth, using Arduino and Python.  A couple of days after that, I had the chance to take my prototype in to school to try to get it running on one of the students' laptops.&lt;br /&gt;&lt;br /&gt;I had pretty high hopes.  She had a beautiful new machine.  It had only taken me a few minutes to move the project from one of my Win XP computers to another.  I had even burned a disk with installation files of all of the software I'd need.  She turned on her computer and we waited.  And waited.  I remembered how disappointing it was the first time I tried Windows 1.0 on my DOS machine.  We waited.  I remembered how long it had taken for my Win 95 and Win 98 machines to boot.  
We waited.  We made small talk.  I asked her how long she'd had her computer (less than a year).  I asked her if she liked it.  She said plaintively, "I think I want a Mac."  We waited some more.&lt;br /&gt;&lt;br /&gt;Things went downhill from there.  I spent about an hour and a half trying to install my software.  Every few minutes, the screen would darken and I would get a security message.  Occasionally, a window would open with a long list of processes that needed to be killed.  I would then hunt them down one-by-one, try to figure out what they did, and stop them.  Unfortunately, the process IDs weren't very useful because they don't appear uniformly in the different tabs of the default display of the task manager.  Once in a while I would get an error message with no way to rectify the situation, other than to accept it.  Eventually I got to a point where it seemed like I was going to damage something, so I spent another hour trying to undo my earlier actions.  Up until a few days ago, I hadn't seen Vista, but had assumed that it couldn't be as bad as the "Hi, I'm a Mac--And I'm a PC" ads made it sound.  I took a quick poll of the students and found that about a fifth of them have Macs, and the rest have new Vista machines.  I decided to bring in a few old machines running XP to use for their exhibits this year.&lt;br /&gt;&lt;br /&gt;I had a lot of time to think while I was sitting there.  I have almost $200K to spend on computers for myself, my colleagues and students over the next few years.  I had assumed that I'd be buying a few Linux machines for power users and Windows machines for the rest.  I just can't see that happening now.  
I'm beginning to think that the non-Linux machines in our new computer labs might be more useful for everyone if they have &lt;a href="http://www.apple.com/macosx/"&gt;an OS that is built on top of Unix&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/suddenoutbreakofcommonsense" rel="tag"&gt;suddenoutbreakofcommonsense&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7791328243982782167?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7791328243982782167'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7791328243982782167'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/02/oses2.html' title='OSes/2'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3735638996140859338</id><published>2008-02-11T10:29:00.001-05:00</published><updated>2008-12-29T19:23:50.361-05:00</updated><title type='text'>Prototyping a Tangible Interface for Google Earth</title><content type='html'>As I mentioned in a &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/11/physical-computing-cards.html"&gt;previous post&lt;/a&gt;, students in my digital history grad class this year are working in teams to create interactive exhibits that involve physical computing.  
Since none of the students have much prior experience with programming or electronics, I'm providing a bit of the scaffolding to help them realize their designs.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;The design&lt;/span&gt;.  One of the groups imagined an interface consisting of a handheld globe.  As you touch a point on the surface of the globe, a computer display responds by orienting a corresponding digital globe to focus on that place, and then opens some panels with information about an event that happened there.  They decided that &lt;a href="http://earth.google.com/"&gt;Google Earth&lt;/a&gt; would make a good software platform.  I left them to the task of creating their exhibit materials in the XML-based Keyhole Markup Language (&lt;a href="http://earth.google.com/userguide/v4/ug_kml.html"&gt;KML&lt;/a&gt;) that Google Earth uses.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Hardware&lt;/span&gt;.  The exhibit will be mounted on a laptop running Windows.  For the tangible interface we're using an &lt;a href="http://arduino.cc/"&gt;Arduino&lt;/a&gt; microcontroller board.  Normally-open pushbutton switches are connected to the digital inputs on the Arduino with 10K pull-down resistors.  We debounced the switches in software by reading their value twice, 10 ms apart.  The Processing program that runs on the Arduino is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/google_earth.pde.html"&gt;here&lt;/a&gt;.  It maintains the state of the last button pressed, and sends it repeatedly over a serial connection to the PC.  
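&lt;br /&gt;&lt;br /&gt;For readers who have not met debouncing before, the two-sample check can be sketched in Python.  This is an illustration of the idea rather than the code that runs on the board (that is in the linked program): read_pin is a hypothetical stand-in for a digital read, and returning None when the samples disagree is my own choice for the sketch.&lt;br /&gt;&lt;br /&gt;
&lt;pre&gt;
```python
import time


def debounced_read(read_pin, delay_s=0.01, sleep=time.sleep):
    """Return a stable pin value, or None if the two samples disagree.

    read_pin: zero-argument function returning the current pin value
              (a hypothetical stand-in for a digital read on the board).
    delay_s:  gap between the two samples; 0.01 s is the 10 ms above.
    sleep:    injectable so the function can be tried without waiting.
    """
    first = read_pin()
    sleep(delay_s)
    second = read_pin()
    # Accept the reading only when both samples agree.
    return first if first == second else None
```
&lt;/pre&gt;
A switch that is still bouncing tends to give two different samples, so the reading is thrown away and taken again on the next pass through the loop.&lt;br /&gt;&lt;br /&gt;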
(If you'd like to try making something like this yourself, &lt;a href="http://www.ladyada.net/learn/arduino/"&gt;Lady Ada's Arduino Tutorial&lt;/a&gt; is a great place to start).&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlp9U4w3uI/AAAAAAAAALI/LJ_wjM7_hGM/s1600-h/google-earth-arduino.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlp9U4w3uI/AAAAAAAAALI/LJ_wjM7_hGM/s200/google-earth-arduino.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285372139995717346" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Python glue&lt;/span&gt;. We needed a program to sit in between the Arduino and Google Earth, collecting information from the former and using it to control the latter.  Python is ideal for this.  First we installed the &lt;a href="http://sourceforge.net/projects/pywin32/"&gt;Python for Windows&lt;/a&gt; extensions and the &lt;a href="http://pyserial.sourceforge.net/"&gt;Python Serial Library&lt;/a&gt;.  We were then able to control Google Earth via the &lt;a href="http://earth.google.com/comapi/index.html"&gt;COM API&lt;/a&gt;.  Our Python test program is &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/google-earth.py.html"&gt;here&lt;/a&gt;.  It first defines two functions, one to orient Google Earth to the main gates of the University of Western Ontario, and one to orient it to Uluru (Ayers Rock) in Australia.  The program then initializes the serial port and Google Earth.  Finally it enters an infinite loop, reading the serial port and calling one of the two navigation functions when the corresponding button is pressed.  
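&lt;br /&gt;&lt;br /&gt;The read-and-dispatch loop described above can be sketched without any hardware attached.  Everything named here is an assumption made for illustration: the button codes, the function names, and the decision to ignore repeated values.  The two navigation functions are placeholders that return a label; in a working version they would drive Google Earth through the COM API, and the codes would come from the serial port rather than a list.&lt;br /&gt;&lt;br /&gt;
&lt;pre&gt;
```python
def dispatch(codes, handlers):
    """Call the handler for each button code, skipping repeats.

    codes:    any iterable of one-byte values, standing in for the
              stream read from the serial port.
    handlers: dict mapping a code to a zero-argument function.
    Returns the list of handler results, in the order they were called.
    """
    results = []
    last = None
    for code in codes:
        # The board resends the last button state over and over,
        # so act only when the value changes.
        if code == last:
            continue
        last = code
        handler = handlers.get(code)
        if handler is not None:
            results.append(handler())
    return results


def go_to_western():
    # Placeholder: a real version would point Google Earth at the
    # main gates of the University of Western Ontario.
    return "Western"


def go_to_uluru():
    # Placeholder: a real version would point Google Earth at Uluru.
    return "Uluru"


# Hypothetical button codes; the real board may send different bytes.
HANDLERS = {b"1": go_to_western, b"2": go_to_uluru}
```
&lt;/pre&gt;
Feeding it [b"1", b"1", b"2"] visits Western once and then Uluru.  Skipping repeats matters because the board keeps resending the last button; without that check the camera would be told to fly to the same place on every pass.&lt;br /&gt;&lt;br /&gt;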
(If you'd like to use Python to control Google Earth, Fran&amp;ccedil;ois Schnell has a &lt;a href="http://docs.google.com/View?docid=dgqhgsgm_933rjw93&amp;pli=1"&gt;very useful page&lt;/a&gt;.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Testing&lt;/span&gt;.  I was quite impressed with how crisply the whole system works together.  To get it running, you go through the following steps.&lt;br /&gt;&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Build and test the button circuit, then connect it to the Arduino.&lt;/li&gt;&lt;li&gt;Start the Arduino software on the PC.&lt;/li&gt;&lt;li&gt;Compile the Arduino program and download it to the board.&lt;/li&gt;&lt;li&gt;Run the Python program.  It will start Google Earth automatically in full screen mode.&lt;/li&gt;&lt;li&gt;Once Google Earth has finished initializing, you can press the buttons to navigate within the program.&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;br /&gt;The final exhibit for our class will be mounted in April.  You will be able to read more about it on the students' &lt;a href="http://digitalhistory.uwo.ca/h513_0708/"&gt;blogs&lt;/a&gt; and at the &lt;a href="http://digitalhistory.uwo.ca/sky/"&gt;exhibit website&lt;/a&gt;.  
I'll write about some of the other components in future posts.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/arduino" rel="tag"&gt;Arduino&lt;/a&gt; | &lt;a href="http://technorati.com/tag/geocoding" rel="tag"&gt;geocoding&lt;/a&gt; | &lt;a href="http://technorati.com/tag/google+earth" rel="tag"&gt;Google Earth&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/physical+computing" rel="tag"&gt;physical computing&lt;/a&gt; | &lt;a href="http://technorati.com/tag/Python" rel="tag"&gt;Python&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3735638996140859338?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3735638996140859338'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3735638996140859338'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/02/prototyping-tangible-interface-for.html' title='Prototyping a Tangible Interface for Google Earth'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlp9U4w3uI/AAAAAAAAALI/LJ_wjM7_hGM/s72-c/google-earth-arduino.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1369990004834288383</id><published>2008-02-04T11:15:00.000-05:00</published><updated>2008-02-04T12:08:39.945-05:00</updated><title type='text'>Freedom of Expression</title><content type='html'>I've been working pretty hard for a couple of months on &lt;span style="font-style:italic;"&gt;&lt;a href="http://digitalhistoryhacks.blogspot.com/2008/01/programming-historian.html"&gt;The Programming Historian&lt;/a&gt;&lt;/span&gt;, and almost all of the code that I'm writing is going directly into the book, rather than appearing here in lightly-documented bits and pieces.  Over the next year I expect to find a balance between hacking in expository mode (the book) and hacking for its own sake.  Each, of course, informs the other.&lt;br /&gt;&lt;br /&gt;In other news, I recently picked up a diminutive and very inexpensive &lt;a href="http://wiki.eeeuser.com/"&gt;Asus Eee PC&lt;/a&gt; and have to say that I'm quite impressed with it.  Part of my enthusiasm is no doubt for &lt;a href="http://www.linux.org/"&gt;Linux&lt;/a&gt;, which I'm just getting around to exploring.  But the machine itself is pretty sweet, too.&lt;br /&gt;&lt;br /&gt;If you're a regular reader of my blog, you might've found that last bit about Linux surprising or disillusioning or something, so I probably should explain.  
When I started programming, the IBM PC hadn't been invented yet, and machines like the Commodore PET, TRS-80 and Apple/Apple II were just becoming available to elementary school kids.  By an accident of school-district purchasing, we had Commodores rather than Apples.  When I was in high school, I talked my way into my first job at the community college (teaching at a "computer camp" for little kids) by claiming to know how to use IBM PCs.  I borrowed a &lt;a href="http://www.computerhope.com/msdos.htm"&gt;DOS&lt;/a&gt; manual and memorized commands over the weekend.  I was only a little embarrassed when I couldn't actually turn on a PC the following Monday morning.  I was looking all over the keyboard for the power switch, which turned out to be a huge red toggle on the side of the case.  (If you look at &lt;a href="http://content.zdnet.com/2346-9595_22-30760.html"&gt;this picture&lt;/a&gt; you can get a sense of my frustration... the power switch is on the back right, hidden by the manuals.)  That summer I became trilingual, adding &lt;a href="http://el.media.mit.edu/Logo-foundation/logo/index.html"&gt;Logo&lt;/a&gt; to my knowledge of BASIC and assembly language.  This is getting to be a pretty long story, so let me skip through the VAX years and my enduring love for LISP/Scheme and functional programming, and get to the early to mid 1990s, which I spent working in a Unix shop.  Right around the time that Linux was taking off, however, I began working with a succession of graduate supervisors (and later colleagues) who were based firmly in the Windows world. And that's how I missed Linux.  Until now.&lt;br /&gt;&lt;br /&gt;It's surprising to me how fast my Unix experience came back when I popped open a shell and started typing commands.  That part is neither here nor there.  What's really great about open source, however, is that &lt;span style="font-weight:bold;"&gt;I am free to fix anything that is bugging me&lt;/span&gt;.  
As soon as I had that feeling of freedom, it all came back to me again.  Why would anyone give that up?  How had I ended up in a situation where I didn't feel that way?&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/free+software" rel="tag"&gt;free software&lt;/a&gt; | &lt;a href="http://technorati.com/tag/gnu+linux" rel="tag"&gt;GNU/Linux&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+source" rel="tag"&gt;open source&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1369990004834288383?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1369990004834288383'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1369990004834288383'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/02/freedom-of-expression.html' title='Freedom of Expression'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3893658240674223177</id><published>2008-01-20T09:40:00.000-05:00</published><updated>2008-01-20T10:45:07.468-05:00</updated><title type='text'>Relevance Feedback</title><content type='html'>Suppose Albert, Betty and Chris are historians of food, technology and Indonesia, respectively.  It's not hard to imagine a scenario where each might sit down in front of Google and type "java" into the search box.  
One of the key problems of designing a search engine is trying to find a way to order the results so that the highest ranked hits will be relevant to the most users.  In this case, let's assume that Google isn't tracking any of the three (i.e., they aren't logged in to GMail or other services and they aren't using their own computers).  I just tried this search while logged in to Google and the top 12 results were relevant to the computer language, followed by one hit for the Indonesian island, followed by thirty-seven more for the computer language.  I stopped counting.  I love coffee, but I don't read about it or buy it online, so it is possible that my searching history helps Google know that I'm probably looking for information about the programming language.  It's also possible that most people who use Google are looking for information about the programming language.&lt;br /&gt;&lt;br /&gt;Google's default assumption in this case is good news for Betty, and not such good news for Albert or Chris.  Each of them could go on to refine their search, of course.  One obvious possibility would be to add keywords ("java +coffee") or subtract them ("java -programming") or both.  But the fact remains that Betty will find what she is looking for immediately, while the other two won't without more digging.  It is easy to see how repeated experiences might shape a person's experience of the web, leading them to see it as a place of &lt;a href="http://www.historycooperative.org/journals/ahr/108.3/rosenzweig.html"&gt;scarcity or abundance&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Without knowing more about what a particular searcher is after, it is very difficult to do better than to match the distribution of result relevance to something else that can be measured easily.  
That may be a measurement of the importance or centrality of sets of documents, or a survey of what users are looking for when they enter popular keywords, or any number of other measures singly or in combination.  Search engine companies can also measure the click-through for particular links.  If most people click on one of the results on the first page of hits and then don't repeat or modify their search, the company can infer that the result was probably relevant to the searcher's needs.&lt;br /&gt;&lt;br /&gt;Machine learning methods are often categorized as "supervised" or "unsupervised."  In the former case, the system gets feedback telling it what is, or even better, what is not, a correct answer.  Unsupervised methods don't receive any feedback, which usually makes their task much more difficult.  If we cast search engine relevance in these terms, we can see that the system faces a task which is only partially supervised at best.&lt;br /&gt;&lt;br /&gt;In information retrieval systems that were created before the web, users typically learned to construct elaborate queries and to refine their queries based on the results that they received.  These systems often included a way for the user to provide relevance feedback.  In the context of the web, queries are typically only a word or two long, and most search engines don't include a mechanism for the searcher to provide direct relevance feedback.  This may be good enough for web searchers taken as a group (it may even be optimal), but it imposes a cost on individual researchers.  Researchers need to be able to find obscure sources, and the best way to do this is to pair them with a system that can learn from relevance feedback.  Digital humanists need tools that go beyond the &lt;a href="http://www.dancohen.org/2006/04/17/the-single-box-humanities-search/"&gt;single box search&lt;/a&gt;.  
And we're probably going to have to write them ourselves.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; |  &lt;a href="http://technorati.com/tag/search" rel="tag"&gt;search&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3893658240674223177?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3893658240674223177'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3893658240674223177'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/01/relevance-feedback.html' title='Relevance Feedback'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3740981537118297304</id><published>2008-01-14T15:10:00.000-05:00</published><updated>2008-01-14T15:36:39.095-05:00</updated><title type='text'>The Programming Historian</title><content type='html'>My colleague &lt;a href="http://history.uwo.ca/faculty/maceachern/"&gt;Alan MacEachern&lt;/a&gt; and I have decided to write a book to teach practicing historians how to use programming to augment their ability to do research online.  
&lt;span style="font-style:italic;"&gt;The Programming Historian&lt;/span&gt; will be provided as an open access work via the website of &lt;a href="http://niche.uwo.ca"&gt;NiCHE: Network in Canadian History &amp;amp; Environment&lt;/a&gt;.  We'll announce the details soon.  In the meantime, here are a few things that will make this work different from existing books about programming...&lt;br /&gt;&lt;br /&gt;1. We think that you should be able to put what you learn to work in your research practice immediately. Many beginning programmers lose patience because they can't see why they're learning what they're learning.&lt;br /&gt;&lt;br /&gt;2. Digital history requires working with sources on the web. This means that you're going to be spending most of your research time working in a browser, so you should be able to use your programming skills in the browser.&lt;br /&gt;&lt;br /&gt;3. Our examples will build on real historical sources online and on open source projects in the digital humanities.  In particular, the programs that you create will be tightly integrated with &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;4. 
We'll draw on a wide range of techniques from information retrieval; text, data and web mining; statistical natural language processing; machine learning; and other disciplines.&lt;br /&gt;&lt;br /&gt;If you'd like to contact us with questions or comments, there is contact information on our faculty web pages: &lt;a href="http://history.uwo.ca/faculty/turkel/"&gt;Turkel&lt;/a&gt; &amp;amp; &lt;a href="http://history.uwo.ca/faculty/maceachern/"&gt;MacEachern&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/browser" rel="tag"&gt;browser&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+access" rel="tag"&gt;open access&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+source" rel="tag"&gt;open source&lt;/a&gt; | &lt;a href="http://technorati.com/tag/programming" rel="tag"&gt;programming&lt;/a&gt; | &lt;a href="http://technorati.com/tag/zotero" rel="tag"&gt;Zotero&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3740981537118297304?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3740981537118297304'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3740981537118297304'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/01/programming-historian.html' title='The Programming Historian'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-531557957211819012</id><published>2008-01-08T10:09:00.000-05:00</published><updated>2008-01-08T12:00:10.448-05:00</updated><title type='text'>Results When and Where You Need Them</title><content type='html'>In my previous post I complained about a taken-for-granted model that carves the research process into discrete stages of information gathering, analysis, writing and publication.  As I noted, I don't think that this model really makes sense anymore.  I've been trying to figure out where it came from, and more to the point, why it persists.&lt;br /&gt;&lt;br /&gt;We all have preferred ways of coming up with explanations, and one of my favorites is to start with an unshakeable belief in the &lt;a href="http://www.amazon.com/2nd-Law-Scientific-American-Paperback/dp/0716760061/"&gt;second law of thermodynamics&lt;/a&gt; and go from there.  In the wake of any event, there are a range of material and documentary sources that can be used to make inferences about what happened.  Time continues, however.  Memories are reworked, documents are lost, physical evidence decays and is disrupted.  Contexts for understanding various pasts change, too, of course.  We might even say that "all is flux."  Against this inexorable dissolution, we've tried to create little islands of stasis.  
These include libraries, museums and archives, and also brass plaques, time capsules, heirloom species, national parks, and mathematical laws.&lt;br /&gt;&lt;br /&gt;In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Into-Cool-Energy-Flow-Thermodynamics/dp/0226739376/"&gt;Into the Cool&lt;/a&gt;&lt;/span&gt;, Schneider and Sagan summarize the second law by saying that "nature abhors a gradient."  To the extent that we don't, we have to pay to maintain them.  For example, there are &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/04/information-costs.html"&gt;information and transaction costs&lt;/a&gt; associated with learning anything. (In this case, the gradient that you are trying to maintain is your own wit and wisdom.  If you're reading this, you may find it easier than your waistline, but they're all losing battles in the long run).  In the past, these costs were highest for moving historians to distant documents and keeping them near those documents temporarily.  When I did archival work and fieldwork for my dissertation, I was acutely aware of the cost of being 3,000 miles from home.  I had the sense that it &lt;span style="font-style:italic;"&gt;really&lt;/span&gt; mattered which box I requested next at the archive, or which place I decided to visit in the field.  Many researchers describe having had similar experiences... it's part of the fun, the frisson, of archival work.  But the high cost of doing research in the material world forces research time into clumps.&lt;br /&gt;&lt;br /&gt;Most academic researchers also have to teach to support themselves, and this introduces another kind of temporal clumping.  Research trips are rarely taken during the school year, and writing is often deferred, too.  I'm trying hard to suffuse my own research and writing throughout the year, but I'm aware that I went for 25 days without posting to my blog last December, and have written five posts in the last 12 days.  
I start teaching again tomorrow, attending job talks, and so on.&lt;br /&gt;&lt;br /&gt;I'm not going to change costs associated with working in the material world, of course.  I'm not going to change the university calendar to a year-round, part-time engagement, either.  But to the extent that the digital world changes the landscape of transaction and information costs that we face, it will make a big difference in our shared research model.&lt;br /&gt;&lt;br /&gt;As I see it, many of the programs that we are currently using impede the unification of the research process.  At a minimum, most historians probably rely on a word processor and web browser.  They may also use a spreadsheet, bibliographic database and more specialized programs like an RSS feed reader, relational database, statistical package, GIS, or concordancer.  Each of these programs is designed to be "sovereign," to use &lt;a href="http://www.chi-sa.org.za/articles/posture.htm"&gt;Alan Cooper&lt;/a&gt;'s term, to be "the only [program] on the screen, monopolizing the user's attention for long periods of time."  The move to Web 2.0 has put a lot of functionality in the browser, and programs like &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt; are clearly a step in the right (only) direction.  But the fact remains that most of our own research materials are locked into little silos.  Moving from one of these silos to another imposes its own granularity on our activities. &lt;br /&gt;&lt;br /&gt;How could this be different?  Think of your Zotero bibliography as the core of your research process.  Every item in it is there because it is relevant to your work.  Suppose you keep your notes and drafts in Zotero, too.  Then for the purposes of digital history, &lt;span style="font-style:italic;"&gt;a good statistical description of your Zotero database is the best and most up-to-the-minute description of your research process&lt;/span&gt;.  
That description will be more accurate to the extent that you can incorporate other streams of information into it, like the feeds that you read, the books that you purchase, and the research-related web searches that you do.  I think that the development of Zotero in the near future will allow more and more of this kind of incorporation, and the fact that the software is open source and provides an API bodes well for using it as a platform for mining.  The key point that I want to emphasize, however, is that measurements of your Zotero bibliography will be most useful to the extent that they are fed back into your research in a useful way.  Suppose you do a quick analysis of a text that you are in the process of reading.  It is quite simple to provide the results of that analysis both as information that you can read, and as a vector that can be used to refine automatic searching or spidering for related material.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/analysis+synthesis" rel="tag"&gt;analysis and synthesis&lt;/a&gt; | &lt;a href="http://technorati.com/tag/browser" rel="tag"&gt;browser&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/electronica" rel="tag"&gt;entropy&lt;/a&gt; | &lt;a href="http://technorati.com/tag/flows" rel="tag"&gt;flows&lt;/a&gt; | &lt;a href="http://technorati.com/tag/information+costs" rel="tag"&gt;information costs&lt;/a&gt; | &lt;a href="http://technorati.com/tag/zotero" rel="tag"&gt;Zotero&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-531557957211819012?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/19974449/posts/default/531557957211819012'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/531557957211819012'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/01/results-when-and-where-you-need-them.html' title='Results When and Where You Need Them'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7506543441300575209</id><published>2008-01-05T09:42:00.000-05:00</published><updated>2008-01-05T11:16:37.975-05:00</updated><title type='text'>All is Flux</title><content type='html'>If you wanted a motto for digital history, it's hard to imagine finding anything better than the one that &lt;a href="http://plato.stanford.edu/entries/heraclitus/"&gt;Heraclitus&lt;/a&gt; is supposed to have come up with around 500 BCE, when he said something to the effect that 'all is flux' or 'everything flows' or 'you can't step into the same river twice'.&lt;br /&gt;&lt;br /&gt;I think that many historians have a research model which looks a bit like this:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Formulate question&lt;/li&gt;&lt;li&gt;Do research&lt;ol&gt;&lt;li&gt;Collect a bunch of sources&lt;/li&gt;&lt;li&gt;Decide which look most promising and skim through those&lt;/li&gt;&lt;li&gt;Read the most relevant ones carefully&lt;/li&gt;&lt;li&gt;Take good notes&lt;/li&gt;&lt;/ol&gt;&lt;/li&gt;&lt;li&gt;Write&lt;/li&gt;&lt;li&gt;Publish&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;We all agree that the stages of the research process are indistinct and blend into one another.  
We all agree that there is a lot of movement to-and-fro and back-and-forth, and time for visions and revisions.  Nevertheless, this research model--what the heck, let's call it &lt;a href="http://plato.stanford.edu/entries/democritus/"&gt;Parmenidean&lt;/a&gt;--is widely enough understood that many professors ask their graduate students questions like "Have you done your research yet?" or "When are you going to start writing?"  The students, in turn, reply with answers that may please or displease their advisors, but which are understood to be &lt;a href="http://www.unc.edu/~gerfen/Ling30Sp2002/pragmatics.htm"&gt;felicitous&lt;/a&gt; in the pragmatic sense.&lt;br /&gt;&lt;br /&gt;Digital historians, on the other hand, have to be thoroughgoing Heracliteans and reject questions like "Have you done your research yet?"  The only sensible way to do research online is to be doing everything all at once all the time.  The research model looks like this:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Until your interpretation stabilizes...&lt;ul&gt;&lt;li&gt;You keep refining your ensemble of questions&lt;/li&gt;&lt;li&gt;Your spiders and feeds provide a constant stream of potential sources&lt;/li&gt;&lt;li&gt;Unsupervised learning methods reveal clusters which help to direct your attention&lt;/li&gt;&lt;li&gt;Adaptive filters track your interests as they fluctuate&lt;/li&gt;&lt;li&gt;You create or contribute to open source software as needed&lt;/li&gt;&lt;li&gt;You write/publish incrementally in an open access venue&lt;/li&gt;&lt;li&gt;Your research process is subject to continual peer review&lt;/li&gt;&lt;li&gt;Your reputation develops&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Do we have what we need to fully implement this strategy?  
A lot of the pieces are already in place, including massive textual databases, search engines with APIs, XML, RSS feeds and feed readers, high-level programming languages, and tools for online scholarship like &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt;.  The combined literature of statistical natural language processing, text and data mining, machine learning, and information retrieval provides a cornucopia of useful techniques.  If you know how to program, you're already most of the way there; if not, now is as good a time as any to begin learning how.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/flows" rel="tag"&gt;flows&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7506543441300575209?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7506543441300575209'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7506543441300575209'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/01/all-is-flux.html' title='All is Flux'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7900031038102322384</id><published>2008-01-02T10:07:00.000-05:00</published><updated>2008-01-02T10:41:19.962-05:00</updated><title type='text'>What's the Opposite of Big History?</title><content type='html'>A couple of times in this blog, I've mentioned &lt;span style="font-style:italic;"&gt;big history&lt;/span&gt;, an ambitious attempt to narrate history from the big bang to the present.  Like microhistory, the Annales school, environmental history, and a few other thematic approaches to the discipline, one of the things that big history teaches us is that we can learn something different by judiciously manipulating the scale of our inquiry.&lt;br /&gt;&lt;br /&gt;By providing us with access to completely new kinds of sources, digital history opens up some additional possibilities for manipulating scale.  Consider, for example, the &lt;a href="http://en.wikipedia.org/wiki/Cache"&gt;cached data&lt;/a&gt; provided by Google and other search engines.  When you do a search you have the option of following the provided link, or of seeing what the page looked like when Google's spiders last visited.  The date and time that the copy was cached is also provided, and it is straightforward to write a program to retrieve the current page and cached copy and compare them to see what has changed.  As a test, I did a Google search for "digital history" on 2 Jan 2008 at 15:05 GMT and recorded the times that the cached copy had been created for each page on the first page of hits.  
Sorted by duration, the results were: 3 days 8 hours 14 minutes, 3d 10h 24m, 3d 15h 14m, 3d 15h 36m, 4d 14h 50m, 5d 3h 12m, 5d 5h 20m, 5d 8h 24m, and 257d 15h 52m.&lt;br /&gt;&lt;br /&gt;Now suppose you wanted to write the history of a very brief interval, say a few hours, minutes or even seconds.  In the past, this kind of history--I'm not sure what to call it--would only have been possible for an event like 9/11, the JFK assassination or D-Day.  But with access to Google's cache data and some sophisticated data mining tools, it becomes possible to imagine creating rich snapshots of web activity over very short intervals.  And to the extent that web activity tracks real world activity and can be used to make inferences about it, it becomes possible to imagine writing the history of one second on earth, or one millisecond, or one microsecond.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/cache" rel="tag"&gt;cache&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/scale" rel="tag"&gt;scale&lt;/a&gt; | &lt;a href="http://technorati.com/tag/search" rel="tag"&gt;search&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7900031038102322384?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7900031038102322384'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7900031038102322384'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/01/whats-opposite-of-big-history.html' title='What&apos;s the Opposite of Big History?'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-757694620118922989</id><published>2008-01-01T11:13:00.000-05:00</published><updated>2008-01-01T12:16:47.275-05:00</updated><title type='text'>The Search Comes First</title><content type='html'>In December, I had a chance to visit humanists at a couple of universities in the Boston area and talk about digital history.  One kind of question that came up repeatedly was foundational: What do humanists really need to know in order to be more effective online researchers?  What should they learn first?  What constitutes a baseline literacy?  How can digital humanists be introduced into existing departments, or the techniques of digital humanities be added to existing curricula?&lt;br /&gt;&lt;br /&gt;Since I started this blog two years ago and began teaching digital history classes, I've had the chance to revisit these questions a number of times.  My &lt;a href="http://digitalhistoryhacks.blogspot.com/2005/12/teaching-young-historians-to-search.html"&gt;original answer&lt;/a&gt; was that it all begins with &lt;span style="font-weight:bold;"&gt;search&lt;/span&gt;, and I think that still holds.  For me, the essence of digital history is the shift to what Roy Rosenzweig called a "&lt;a href="http://www.historycooperative.org/journals/ahr/108.3/rosenzweig.html"&gt;culture of abundance&lt;/a&gt;."  The internet is unimaginably large and growing exponentially.  
Individual researchers, on the other hand, have a sharply bounded capacity to absorb or make sense of new material.&lt;br /&gt;&lt;br /&gt;I think that a lot of historians are resistant to the idea of processing documents computationally, because they think of it as a challenge to, or substitute for, reading.  Instead, computation should be seen as a way to augment human abilities.  We still need human beings to read and interpret sources, and we must still train our students in traditional philological techniques.  There's no getting around the fact, however, that the way that we &lt;span style="font-style:italic;"&gt;find&lt;/span&gt; sources has drastically changed in the last ten or fifteen years.&lt;br /&gt;&lt;br /&gt;According to &lt;a href="http://searchenginewatch.com/showPage.html?page=3627304"&gt;Search Engine Watch&lt;/a&gt;, as of this past summer search engines worldwide were handling about 61 billion searches per month.  More than half of these were handled by Google, making its ranking algorithms the most pervasive source of bias in the history of research.  It's clear that humanists need to understand how search engines work, and need to be able to parameterize their searches to get the best results.  Your ability to do a virtuoso close reading is irrelevant if you can't find the sources to read in the first place.  Humanists who wish to place their own material online also need to understand search engine technology, because it is the deciding factor in whether a work can be found, read and cited.&lt;br /&gt;&lt;br /&gt;In my conversations last month, the follow-up question was usually whether or not historians and other humanists will need to be able to program computers.  I'm not sure about the answer to that.  I'm certain that &lt;span style="font-style:italic;"&gt;some&lt;/span&gt; of them will.  
The discipline of history is in for some interesting times, as interpretations backed by intensive research in a few archives will be confronted with those backed by machine learning or text mining of massive datasets.  My hope is that we'll find a rapprochement... but then I'm an optimist.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/clues" rel="tag"&gt;clues&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/findability" rel="tag"&gt;findability&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+education" rel="tag"&gt;history education&lt;/a&gt; | &lt;a href="http://technorati.com/tag/search" rel="tag"&gt;search&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-757694620118922989?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/757694620118922989'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/757694620118922989'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2008/01/search-comes-first.html' title='The Search Comes First'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6561433844916213468</id><published>2007-12-28T15:46:00.001-05:00</published><updated>2008-12-29T19:15:10.544-05:00</updated><title type='text'>On Blogs and Frogs</title><content type='html'>[Cross-posted to Cliopatria &amp;amp; Digital History Hacks]&lt;br /&gt;&lt;br /&gt;Last week some time, my eyes popped open in the middle of the night and I realized that it's been quite a while since I blogged.  I was too tired to get up and rectify the situation, but, of course, that didn't stop me from lying there half-awake and thinking about blogging.  My mind turned to the fact that I've been even more remiss about cross-posting to Cliopatria from time to time. I imagined that some Cliopatrians (O.K., &lt;a href="http://www.ralphluker.com/"&gt;Ralph E. Luker&lt;/a&gt;) were probably posting more than a hundred times for each one time that I managed to.  &lt;br /&gt;&lt;br /&gt;From there I got to thinking about the students in my &lt;a href="http://digitalhistory.uwo.ca/h513_0708/"&gt;digital history grad class&lt;/a&gt;.  They have to blog as the written component of their coursework.  Although I'm very explicit about my preference for quality over quantity, you'd think that they would be motivated to produce approximately the same amount of written work as one another.  Nevertheless, I had a sense that there could easily be an order of magnitude difference in output between the most and least-frequent posters.  I tried to visualize what the distributions would look like: probably a &lt;a href="http://www.shirky.com/writings/powerlaw_weblog.html"&gt;power law&lt;/a&gt;.  Since that night, I've had a chance to check.  
The figure below shows the number of times that various members of Cliopatria and of my grad class posted between the beginning of September and now.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVloAKq2P7I/AAAAAAAAALA/zkpNIS75tXc/s1600-h/blogging-activity.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 130px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVloAKq2P7I/AAAAAAAAALA/zkpNIS75tXc/s200/blogging-activity.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285369989769346994" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I think most academics, including my students, quickly learn that they have strong preferences for some kinds of writing rather than others.  One person likes to write abstruse monographs, one popular books, one carefully-crafted essays.  Some of us have found that we're able to blog and some people seem to be especially good at it.  There's an ecology of scholarly production, and we all have to find our niche.&lt;br /&gt;&lt;br /&gt;So I was lying there thinking about blogs and I realized that it reminded me of something, what was it? Oh yeah, frog communication.  (It was the middle of the night.)  Many years ago I read an utterly charming paper on the subject in &lt;span style="font-style:italic;"&gt;Scientific American&lt;/span&gt;, and it's stuck with me (Peter M. Narins, "Frog Communication," &lt;span style="font-style:italic;"&gt;Sci Am&lt;/span&gt;, Aug 1995, 78-83).  In its efforts to attract females, the male coqui, a tiny Puerto Rican frog, makes a chirping call that is louder than a jackhammer.  This raises many questions, not the least of which is "how [does] such a small creature protect itself from its own racket?"  The answer turns out to be a fascinating lesson in evolution and engineering, so be sure to read the paper.  
What's interesting from the point of view of blogging, or scholarly production more generally, is that the frogs also have a special neural mechanism that follows the periodic calls made by other creatures, predicts windows of relative silence, and allows them to blast their own calls into the gaps.&lt;br /&gt;&lt;br /&gt;Now based on my own experience to date, I rarely blog in response to external factors.  Instead, I blog when I can get up the gumption to do so.  Like many scholars, I've grown used to the idea that when you write something, you're adding it to a body of knowledge that is growing, if not monotonically, at least pretty steadily.  On that view, the relative timing of different contributions doesn't matter so much, unless you're in a race for the Nobel prize or something.  As historians, we can usually afford to take the long view.&lt;br /&gt;&lt;br /&gt;Frogs, however, don't take the long view.  As Charles F. Hockett argued in another classic &lt;span style="font-style:italic;"&gt;Scientific American&lt;/span&gt; article, human language is apparently unique among animal communication systems because it allows us "to talk about things that are remote in space or time (or both) from where the talking goes on" ("The Origin of Speech," &lt;span style="font-style:italic;"&gt;Sci Am&lt;/span&gt;, Sep 1960, 89-96).  For the frog, there's &lt;span style="font-style:italic;"&gt;right here, right now&lt;/span&gt;, give or take a few hundred milliseconds to squeeze in the call where it is most likely to be heard.&lt;br /&gt;&lt;br /&gt;Thinking about blogging as a contribution to an infinite archive pushes us a bit too close to the frog's view of the world for comfort.  Imagine having to squeeze your post in &lt;span style="font-style:italic;"&gt;right here, right now&lt;/span&gt;, the only place where it has a hope of making any difference for anybody. The history blogosphere is already too vibrant, too far-flung for most people to monitor effectively.  
As more voices are added to the cacophony it's going to become harder and harder to be heard. How can we hope to get it right?  Here's where we have a real advantage over the frog.  We have the ability to create machines which simulate neural and evolutionary processes.  Imagine the blogger of the future, augmented by an artificial system that monitors discourse, predicts gaps and pops in your contribution when and where it's most likely to be cited.  Over time, the system learns what you are capable of, and becomes more effective at getting your message out.  Does that sound crazy?  Ribbit!&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/blogs" rel="tag"&gt;blogs&lt;/a&gt; | &lt;a href="http://technorati.com/tag/cliopatria" rel="tag"&gt;Cliopatria&lt;/a&gt; | &lt;a href="http://technorati.com/tag/eleutherodactylus+coqui" rel="tag"&gt;Eleutherodactylus coqui&lt;/a&gt; | &lt;a href="http://technorati.com/tag/findability" rel="tag"&gt;findability&lt;/a&gt; |  &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-6561433844916213468?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6561433844916213468'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6561433844916213468'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/12/on-blogs-and-frogs.html' title='On Blogs and Frogs'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_mg_RqiBYrpE/SVloAKq2P7I/AAAAAAAAALA/zkpNIS75tXc/s72-c/blogging-activity.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4357493090929191439</id><published>2007-12-03T08:18:00.000-05:00</published><updated>2007-12-03T08:58:02.887-05:00</updated><title type='text'>Geo-DJ, Part 1: The Idea</title><content type='html'>In some of my earlier research on what I called &lt;a href="http://digitalhistory.uwo.ca/pbc/"&gt;place-based computing&lt;/a&gt;, I used handheld and tablet computers with GPS receivers to present archival materials to historians during fieldwork.  Say you are standing in front of an old building.  The system uses the GPS to determine your location, which is plotted in a geographic information system (GIS).  The GIS layers include &lt;a href="http://www.gsd.harvard.edu/gis/manual/georeferencing/index.htm"&gt;georeferenced&lt;/a&gt; historical maps and aerial photographs, so you can see what was around your present position (or thought to be around your position) when those historical representations were created.  The GIS also includes hyperlinks to other kinds of historical sources, like photographs of buildings and streetscapes, census returns, newspaper articles, city directories, and so on.  You can click on a digitized source to consult it, comparing it with the material sources that accumulate naturally in the &lt;a href="http://www.amazon.com/Archive-Place-Unearthing-Chilcotin-Plateau/dp/0774813768/"&gt;archive of place&lt;/a&gt;. 
The system tested pretty well for individual researchers and small walking tours, although our prototypes were not very robust, had relatively short battery lives and could be difficult to read in direct sunlight.&lt;br /&gt;&lt;br /&gt;The system that I am designing now, the &lt;span style="font-style:italic;"&gt;&lt;span style="font-weight:bold;"&gt;geo-DJ&lt;/span&gt;&lt;/span&gt;, expands this work into an ambient, auditory dimension.  Imagine walking around outside with an iPod-like device that is playing an electronic soundtrack. The music changes as you move, reflecting the historical land use patterns of the area that you are exploring. You may choose to represent patches of original forest with a flute, a dairy farm with bass viol and cow bells, a factory with a percussion ensemble, a slaughterhouse with discordant horns. As you walk towards the site of an old factory, the sounds of percussion rise in volume to dominate the music. Like the earlier place-based systems, the geo-DJ includes a GPS receiver and is based on GIS technology.  The system determines your present position, then calculates distance and direction from the &lt;a href="http://www.gsd.harvard.edu/gis/manual/vector/index.htm"&gt;centroids&lt;/a&gt; of the historical features of interest.  That data will then be used to mix the audio tracks that represent each feature.  &lt;br /&gt;&lt;br /&gt;At the moment, I'm working with a number of different hardware designs.  The easiest ones to build will make use of the same handheld / GPS / GIS platform that I used earlier.  I'm also experimenting with using dedicated audio hardware and microcontrollers like &lt;a href="http://www.arduino.cc/"&gt;Arduino&lt;/a&gt;.  Although I envision using the system as a &lt;a href="http://digitalhistoryhacks.blogspot.com/search?q=%22history+appliance%22"&gt;history appliance&lt;/a&gt;, many other applications are possible.  
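&lt;br /&gt;&lt;br /&gt;As a rough illustration of the mixing step described above, here is a minimal Python sketch. It is not the geo-DJ software itself: the feature list, the planar coordinates in metres, the function names and the inverse-distance falloff curve are all assumptions made for the example.

```python
import math

# Hypothetical features: (name, instrument track, centroid (x, y) in metres
# in some projected coordinate system). None of these values come from the post.
FEATURES = [
    ("forest", "flute", (0.0, 0.0)),
    ("factory", "percussion", (300.0, 400.0)),
]


def distance_and_bearing(position, centroid):
    """Return (distance in metres, compass bearing in degrees) to a centroid."""
    dx = centroid[0] - position[0]
    dy = centroid[1] - position[1]
    distance = math.hypot(dx, dy)
    # atan2(dx, dy) measures the angle clockwise from north (the +y axis).
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance, bearing


def mix_gains(position, features=FEATURES, falloff=100.0):
    """Map each track to a gain between 0 and 1; the gains sum to 1."""
    raw = {}
    for _name, track, centroid in features:
        distance, _bearing = distance_and_bearing(position, centroid)
        # Assumed falloff curve: full weight at the centroid,
        # half weight at `falloff` metres away.
        raw[track] = falloff / (falloff + distance)
    total = sum(raw.values())
    return {track: weight / total for track, weight in raw.items()}
```

Calling mix_gains with the listener's current position returns one gain per track, and the percussion gain rises as the position approaches the factory centroid. The bearing is computed but unused here; it could drive stereo panning in a fuller version.&lt;br /&gt;&lt;br /&gt;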
I'll be posting software and hardware notes here for other people who want to hack the geo-DJ.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/electronica" rel="tag"&gt;electronica&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historical+consciousness" rel="tag"&gt;historical consciousness&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place" rel="tag"&gt;place&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place+based+computing" rel="tag"&gt;place-based computing&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4357493090929191439?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4357493090929191439'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4357493090929191439'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/12/geo-dj-part-1-idea.html' title='Geo-DJ, Part 1: The Idea'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4153418436154074824</id><published>2007-11-20T10:16:00.001-05:00</published><updated>2008-12-29T19:12:13.567-05:00</updated><title type='text'>Physical Computing Cards</title><content type='html'>In most public history graduate programs (including the one that I teach in) students get a good grounding in presenting history to the public in the form of images, texts and objects of material culture.  &lt;a href="http://history.uwo.ca/gradstudy/publichistory/"&gt;Our program&lt;/a&gt;, like a growing number of others, also emphasizes the public historian's need to be able to communicate using various new media.  Each year we try to add new tools and new techniques.  The digital world is, of course, changing much faster than we can keep up with; typical undergraduate curricula change a lot less rapidly than I'd like.  Our students respond to the challenge in various ways.  Some seem to dislike drinking from the firehose while &lt;a href="http://p-stewart.blogspot.com/2007/11/ending-our-fences.html"&gt;others&lt;/a&gt; are more willing to take it in stride.&lt;br /&gt;&lt;br /&gt;I don't think that the idea of simply &lt;span style="font-style:italic;"&gt;presenting&lt;/span&gt; history goes far enough, however.  Over the past few years, I've begun to think of the public historian's problem as one of interaction design.  When we've done our job, we will be able to describe not only how members of the public respond to our work, but how our work responds to them.  
It will be appropriate, in other words, to think of our work in terms of how it &lt;span style="font-style:italic;"&gt;behaves&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;For my long-suffering students, this means the need to learn more about computers than many would prefer.  The computer, after all, is the most behaviorally plastic artifact that has ever been created.  If we can specify an interaction algorithmically, we can implement at least part of it on a computer.  Public history,  however, is conducted in a number of venues and settings that make it impractical to use a desktop or laptop computer.  In previous years the public history students, some colleagues and I have used GPS-enabled handheld computers to move pieces of the archive into the field (more information &lt;a href="http://digitalhistory.uwo.ca/pbc/"&gt;here&lt;/a&gt;).  This year, I'm trying to expand our repertoire to include the use of microcontrollers and transducers, an approach that is nicely covered in O'Sullivan and Igoe's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Physical-Computing-Sensing-Controlling-Computers/dp/159200346X/"&gt;Physical Computing&lt;/a&gt;&lt;/span&gt; and Igoe's new &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Making-Things-Talk-Practical-Connecting/dp/0596510519/"&gt;Making Things Talk&lt;/a&gt;&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;Most of my students have had little or no exposure to electronics and don't really have a sense of how to put off-the-shelf hardware modules together to create useful effects.  We don't have a workshop space where people can solder (at least not yet) and don't have enough equipment for each student to build his or her own project.  To get around some of these difficulties, I decided to create a collection of cards that can be laser printed on business card stock.  
Each card shows a picture of a device and has little glyphs along the sides that indicate how it can be combined with other devices.  The basic scheme is laid out like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlmFrpMQTI/AAAAAAAAAKg/xn3JxfNlMTo/s1600-h/analog-digital.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 114px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlmFrpMQTI/AAAAAAAAAKg/xn3JxfNlMTo/s200/analog-digital.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285367885496860978" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I'm planning to use the cards in studio by talking through some of the basic principles of physical computing and describing how particular effects or installations might be created.  Suppose, for example, that you wanted a museum exhibit to sense the presence of a viewer, try to figure out if it was a child or adult, and adjust the wording of the artifact captions accordingly.  One way to implement something like that would be to use force-sensitive resistors hidden in a floor mat to determine the person's weight, and establish one or more thresholds to set an appropriate caption, which would then be displayed on an LCD.  All of the computation could be done onboard a microcontroller module like &lt;a href="http://www.arduino.cc"&gt;Arduino&lt;/a&gt;.  
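&lt;br /&gt;&lt;br /&gt;The threshold logic in that example can be sketched in a few lines. This is written in Python for readability rather than as microcontroller firmware, and the threshold values, the 0 to 1023 reading range (assuming a 10-bit analog input) and the caption strings are placeholders invented for illustration.

```python
from bisect import bisect_right

# Hypothetical calibration: sensor readings at which the caption changes.
# Readings are assumed to run from 0 to 1023 (a 10-bit analog input).
THRESHOLDS = [150, 450]

# One caption per weight band; there is always one more band than thresholds.
CAPTIONS = [
    "",                                 # below 150: nobody on the mat
    "(caption written for children)",   # 150 to 449: lighter visitor
    "(caption written for adults)",     # 450 and up: heavier visitor
]


def choose_caption(reading):
    """Pick the caption whose weight band contains the sensor reading."""
    # bisect_right counts how many thresholds the reading has reached,
    # which is exactly the index of the matching band.
    return CAPTIONS[bisect_right(THRESHOLDS, reading)]
```

Adding a third audience only means adding one threshold and one caption, which is the reason for using a sorted list rather than a chain of comparisons.&lt;br /&gt;&lt;br /&gt;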
Laid out as a block diagram using these cards, the system might look something like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlmN5cOpeI/AAAAAAAAAKo/lMUv5C5dx9Q/s1600-h/weight-caption-example.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlmN5cOpeI/AAAAAAAAAKo/lMUv5C5dx9Q/s200/weight-caption-example.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285368026639541730" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Having explained a bit about how each module works, I can then pose a series of increasingly difficult design challenges to the students and talk through their ideas with them. How would you make a light come on to illuminate a panel when someone approaches?&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlmWsdGIfI/AAAAAAAAAKw/NkKGd1CW4lw/s1600-h/panel-light-example.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlmWsdGIfI/AAAAAAAAAKw/NkKGd1CW4lw/s200/panel-light-example.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285368177772339698" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Given our available equipment, the designs can be more elaborate.  How would you build a Wii-style wireless remote into a replica of some historical scientific apparatus?  
One possibility might look something like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlmfu-FOuI/AAAAAAAAAK4/eQp9QC-JAs8/s1600-h/wii-type-example.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 200px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlmfu-FOuI/AAAAAAAAAK4/eQp9QC-JAs8/s200/wii-type-example.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285368333066386146" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;PDF pages  of the cards that I've made so far are &lt;a href="https://sites.google.com/site/digitalhistoryhacks/Home/data-files/physcomp-cards-01-04.pdf"&gt;here&lt;/a&gt; (8Mb). If you are interested in printing your own, you can contact me for a zipped file of the JPEGs of individual cards.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt; | &lt;a href="http://technorati.com/tag/physical+computing" rel="tag"&gt;physical computing&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4153418436154074824?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4153418436154074824'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4153418436154074824'/><link rel='alternate' type='text/html' 
href='http://digitalhistoryhacks.blogspot.com/2007/11/physical-computing-cards.html' title='Physical Computing Cards'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlmFrpMQTI/AAAAAAAAAKg/xn3JxfNlMTo/s72-c/analog-digital.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5781339801670796716</id><published>2007-11-13T18:22:00.001-05:00</published><updated>2008-12-29T19:03:59.086-05:00</updated><title type='text'>How To: Make a Museum Exhibit Mockup with Free Tools, Part 3</title><content type='html'>In the previous parts (&lt;a href="http://digitalhistoryhacks.blogspot.com/2007/11/how-to-make-museum-exhibit-mockup-with.html"&gt;1&lt;/a&gt;, &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/11/how-to-make-museum-exhibit-mockup-with_13.html"&gt;2&lt;/a&gt;) we built a 3D model of an imaginary museum exhibit in Google SketchUp and then created some views of it to use in a presentation to potential clients.  Those views showed the geometry of the exhibit space, but not much more.  Now we are going to use &lt;a href="http://www.gimp.org/"&gt;The GNU Image Manipulation Program&lt;/a&gt; (GIMP) to modify one of these views to create a more compelling vision of what the exhibit could be like.&lt;br /&gt;&lt;br /&gt;The museum exhibit that I'm using for this demonstration is entirely made up.  I'm going to say that it is about video arcade games in the late twentieth century, with a focus on technology and culture.  Since I don't have any images or artifacts, I'm going to have to use ones that are already online.  
I don't want to violate anyone's copyright, so I need to search for images that have a &lt;a href="http://search.creativecommons.org/#"&gt;Creative Commons&lt;/a&gt; license.  I search Flickr for photographs of "video game", "arcade game" and other likely terms, and save links to ones that look promising.&lt;br /&gt;&lt;br /&gt;Software packages like The GIMP, or its commercial cousin Adobe Photoshop, allow you to manipulate almost every aspect of an image and to combine multiple images into one by compositing layers.  Think of this as working with a stack of transparencies.  You can manipulate different pieces of your image on different layers, and when you are ready to produce a final image, you merge them all together.  In its simplest form, this compositing process stacks up the images and figures out what is visible from the top and what isn't.  More sophisticated techniques allow you to use the contents of one layer to influence another. This will become more clear as we go along.&lt;br /&gt;&lt;br /&gt;Let's start with the image of &lt;a href="http://digitalhistory.uwo.ca/dhh/images/mockup-view-entrance.jpg"&gt;the entrance&lt;/a&gt; that we created last time.  We open the file in The GIMP.  I also want to use an image of a &lt;a href="http://flickr.com/photos/scalleja/104706583/"&gt;PacMan graffito&lt;/a&gt; from Barcelona, so open that in The GIMP too.  Starting with the graffito, use Select-&gt;All and Edit-&gt;Copy to put a copy on the clipboard.  Now go to the entrance image and use Edit-&gt;PasteInto to plunk it into the middle.  It doesn't look very good at the moment, but don't worry about that.  If you look at the Layers window in The GIMP you will see that you now have a Background layer (the image of the entrance) and a new Floating Layer on top of it.  
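&lt;br /&gt;&lt;br /&gt;For readers who want to see what the simplest kind of layer compositing does to a single pixel, here is a minimal Python sketch of a plain opacity blend. It assumes opaque RGB pixels with 0 to 255 channels and a function name chosen for this example; it is not how The GIMP is implemented internally, and other modes such as Hard Light use different formulas.

```python
def blend_normal(bottom, top, opacity):
    """Composite one RGB pixel of the top layer over the bottom layer.

    bottom, top: (r, g, b) tuples of integers from 0 to 255.
    opacity: 0.0 (top layer invisible) to 1.0 (top layer fully covers).
    """
    return tuple(
        # Weighted average of the two layers, channel by channel.
        round(opacity * t + (1.0 - opacity) * b)
        for b, t in zip(bottom, top)
    )
```

At an opacity of 1.0 the result is simply the top pixel, at 0.0 it is the bottom pixel, and values in between mix the two in proportion.&lt;br /&gt;&lt;br /&gt;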
If you use your cursor to select the floating layer, and drag the Opacity slider to the left, you will see what you just pasted start to become transparent, so you can see the underlying layer through it.  With 50% opacity, the two images look like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlk0ASixZI/AAAAAAAAAJ4/OKR6z7vissY/s1600-h/mockup-view-entrance-modified-1.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlk0ASixZI/AAAAAAAAAJ4/OKR6z7vissY/s200/mockup-view-entrance-modified-1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285366482289739154" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;What we want to do is move the two little figures left and up, scale them appropriately, and blend them so they look more like they are painted on the wall near the entrance.  First use the Scale tool on the graffito layer and make it about 67% of its original size.  Then Move it into the place where you want it.  Next use the Crop tool to trim the space around the two figures.  Check the "Current layer only" box, draw a rectangle around the figures, and press Enter.  If you make a mistake, undo it.  Go to the Layers window, and where it says Mode: Normal, choose Mode: Hard Light.  
Your image should look something like this now:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlk9L0AdeI/AAAAAAAAAKA/ZN6WpsQKxlU/s1600-h/mockup-view-entrance-modified-2.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlk9L0AdeI/AAAAAAAAAKA/ZN6WpsQKxlU/s200/mockup-view-entrance-modified-2.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285366640001709538" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Next we want to create some "pills" (the kind that Pac Man ate) and we want the texture to match our two monsters.  Go to the window with the original graffito, Select a circular region of painted wall, and copy it to the clipboard.  Return to the image we're working on, create a New Layer, and use the PasteInto command to paste eight or nine copies of the circle into it.  As you paste each, use Move to arrange them in a row of pills running down the right side of the entrance way.  Adjust the opacity and mode to match the two monsters.  My version now looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVllF9khN2I/AAAAAAAAAKI/3I7ybb0jv18/s1600-h/mockup-view-entrance-modified-3.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVllF9khN2I/AAAAAAAAAKI/3I7ybb0jv18/s200/mockup-view-entrance-modified-3.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285366790797473634" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;It still needs a bit of pizzazz.  Let's add an image of &lt;a href="http://flickr.com/photos/patrick_q/115638541/"&gt;a joystick&lt;/a&gt; to the lower left hand corner.  Create a new layer and paste the joystick into it.  
Align it in the corner, then use the Crop tool to remove the other controller from the original photo, and anything that is overlapping the edges of the museum wall.  Now you can use the Fuzzy Select tool to remove the background from the joystick picture.  Once you've upped the Brightness and Contrast of that layer, you end up with something like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVllNw8G_7I/AAAAAAAAAKQ/3PJvcpNb8Ek/s1600-h/mockup-view-entrance-modified-4.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVllNw8G_7I/AAAAAAAAAKQ/3PJvcpNb8Ek/s200/mockup-view-entrance-modified-4.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285366924845711282" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now we want to add some text to title our exhibit.  Let's call it "Wakka Wakka: Technology, Culture and Consumption in the History of the Arcade Game."  Choose the Text tool, pick the OCR A Extended font, size 60 pixels, centered.  Create a new layer and type the title.  Use Layer-&gt;DiscardTextInformation to turn the text into a regular layer, and rotate the text so it is at an angle.  Create another layer with the subtitle, using a 30 pixel font.  Use the Hard Light mode to composite both text layers.  
My final version looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVllWpf3IUI/AAAAAAAAAKY/oP0ULZpnGGM/s1600-h/mockup-view-entrance-modified-final.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVllWpf3IUI/AAAAAAAAAKY/oP0ULZpnGGM/s200/mockup-view-entrance-modified-final.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285367077467005250" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Using similar techniques, it is possible to modify the other SketchUp stills so they suggest what the exhibit will be like.  I was originally planning to do all of the exhibit views but ran out of time, so I will have to save that for another day.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5781339801670796716?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5781339801670796716'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5781339801670796716'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/11/how-to-make-museum-exhibit-mockup-with_8511.html' title='How To: Make a Museum Exhibit Mockup with Free Tools, Part 3'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlk0ASixZI/AAAAAAAAAJ4/OKR6z7vissY/s72-c/mockup-view-entrance-modified-1.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7508171326769981463</id><published>2007-11-13T10:02:00.001-05:00</published><updated>2008-12-29T18:58:47.598-05:00</updated><title type='text'>How To: Make a Museum Exhibit Mockup with Free Tools, Part 2</title><content type='html'>In the &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/11/how-to-make-museum-exhibit-mockup-with.html"&gt;first part&lt;/a&gt; we made a simple 3D mockup of an imaginary museum exhibit using the freely available &lt;a href="http://sketchup.google.com/"&gt;Google SketchUp&lt;/a&gt; tool.  The great thing about 3D modeling is that it allows you to explore a space from various vantage points.  For our purposes, two points of view are particularly important.  First, how can we best show off this space to a potential client?  We want to find views that can convey the unity of the big picture, and ones that emphasize individual highlights.  The second set of vantage points that we have to consider, of course, is that of the museum visitors.  This means figuring out the most likely path through the space and the places where someone is probably going to stop, to look at something or to read a panel.  The professional version of SketchUp (the one that you pay for) allows you to create animated walk-throughs which can be particularly convincing.&lt;br /&gt;&lt;br /&gt;Here we'll stick with free tools, so we are going to think of our next step as creating a storyboard.  
Our goal is to turn a full three-dimensional world into a linear narrative and an accompanying series of two-dimensional still shots, not something that most historians are trained to do.  I've found books like &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Understanding-Comics-Invisible-Scott-Mccloud/dp/006097625X/"&gt;Understanding Comics&lt;/a&gt;&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Film-Directing-Visualizing-Concept-Productions/dp/0941188108/"&gt;Film Directing: Shot by Shot&lt;/a&gt;&lt;/span&gt; to be very useful resources for the process of storyboard design.&lt;br /&gt;&lt;br /&gt;Start SketchUp and load the &lt;a href="http://digitalhistory.uwo.ca/dhh/images/mockup.skp"&gt;exhibit mockup&lt;/a&gt;.  Start &lt;a href="http://www.gimp.org/"&gt;The GIMP&lt;/a&gt; while you're at it.  For the exhibit proposal that we're making, we are going to want one shot that shows the whole space at a glance.  This should be elevated enough so that it won't be confused with any realistic vantage point within the exhibit.  We want to suggest a microcosm, and, by implication, a powerful viewer.  In SketchUp, use the Orbit tool to get a view into your space that shows the interior walls and is looking down from a relatively steep angle.  Now choose the Zoom tool, type &lt;em&gt;55&lt;/em&gt; and press Enter to get a wide field of view.  Click Zoom to Extents so that your space fills the screen.  You can try adjusting angles until you get a view you like.&lt;br /&gt;&lt;br /&gt;Next you are going to output some 2D pictures of your space as JPEG files.  JPEG images use lossy compression, which means that they are small and easy to work with, and usually of good enough quality for online presentation.  For archival work, or if you were planning to print your images, you would probably use Tagged Image File Format (TIFF or TIF) files instead.  
The TIF format retains a maximum amount of information, which means that the images are often very large but of high quality.&lt;br /&gt;&lt;br /&gt;If you are happy with your view, use File-&gt;Export-&gt;2DGraphic to create a JPEG.  Use the Options button to set the image size so that (a) the width is 1024 pixels and the height is a little greater than 768 pixels, OR (b) the height is 768 pixels and the width is a little greater than 1024 pixels.  What you don't want is a situation where the width is less than 1024 or the height less than 768 or both.  You can monkey around with this until you see what I mean.  Save your JPEG and then go to The GIMP and open it up.  Ignore any warning messages you get; the software will do the right thing with your file.&lt;br /&gt;&lt;br /&gt;In The GIMP, choose Image-&gt;CanvasSize.  The width and height of the file are shown at the top, joined together on the right by a little chain.  If you try to change one value, the other changes automatically, because the two values are chained together. Click on the Chain to break the link.  Set the width to 1024, the height to 768 and click Resize.  Now choose File-&gt;Save and click Export.  This will save your file with the new dimensions.  Use File-&gt;Close to close the image in The GIMP.  Go out to Windows, right-click on your file and choose Properties.  Click the Summary tab, and the Advanced button.  It should confirm that your file is 1024 x 768 with 72 dpi.  It is a good idea to get in the habit of keeping track of the properties of your image files.  
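&lt;br /&gt;&lt;br /&gt;As a rough check on the sizing rule above, here is a minimal sketch in Python (it is not part of SketchUp or The GIMP) that works out the smallest export size that still covers 1024 x 768 for a window of given proportions.  The function names and the 1280 x 800 example window are assumptions made for illustration.

```python
TARGET_W, TARGET_H = 1024, 768  # the 4:3 presentation size used in this post


def ceil_div(a, b):
    """Integer division that rounds up instead of down."""
    return -(-a // b)


def cover_size(view_w, view_h, target_w=TARGET_W, target_h=TARGET_H):
    """Smallest export size with the view's proportions that covers the target.

    view_w and view_h are hypothetical names for the proportions of the
    SketchUp window; only their ratio matters.
    """
    # Candidate (a): pin the width to the target and let the height follow.
    by_width = (target_w, ceil_div(view_h * target_w, view_w))
    # Candidate (b): pin the height to the target and let the width follow.
    by_height = (ceil_div(view_w * target_h, view_h), target_h)
    # The candidate that covers the target is never smaller than the other
    # in either dimension, so comparing the two tuples selects it.
    return max(by_width, by_height)


def trim(export_w, export_h, target_w=TARGET_W, target_h=TARGET_H):
    """Pixels that resizing the canvas to the target size will cut away."""
    return export_w - target_w, export_h - target_h


if __name__ == "__main__":
    w, h = cover_size(1280, 800)  # a 16:10 window, chosen only as an example
    print(w, h, trim(w, h))
```

For a 1280 x 800 window this suggests exporting at 1229 x 768, after which the canvas resize trims 205 pixels of width and nothing from the height.&lt;br /&gt;&lt;br /&gt;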
My first view looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlgznYWlMI/AAAAAAAAAJQ/cgapiDIyQMs/s1600-h/mockup-view-whole.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlgznYWlMI/AAAAAAAAAJQ/cgapiDIyQMs/s200/mockup-view-whole.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285362077556708546" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We now want to generate some views that are more representative of what the visitor would see.  First, we want a shot of the view from outside the entrance.  Choose Camera-&gt;StandardViews-&gt;Right and export a JPEG of the view.  Mine looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlg9Xu3XYI/AAAAAAAAAJY/4iUKiSznv8Q/s1600-h/mockup-view-entrance.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlg9Xu3XYI/AAAAAAAAAJY/4iUKiSznv8Q/s200/mockup-view-entrance.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285362245154856322" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As the visitor enters, I'm going to assume that his or her attention is drawn first to the projected image.  (At this point Bryce might be getting in the way.  You can try moving him around the space and orienting him appropriately, or simply drag him out of the way, which is what I did.)  Choose Zoom and set the field of view to 65 degrees.  Then choose Camera-&gt;StandardViews-&gt;Front.  We want to give more of a feeling of immersion with this view, so choose Camera-&gt;Walk, put the cross hairs on the front edge of the carpet and click.  Then you can use the UpArrow on the keyboard to move into the scene at the right eye level.  When you're happy with the view, export a JPEG.  
Mine looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlhGP3BSOI/AAAAAAAAAJg/PgwXOfjDDzs/s1600-h/mockup-view-projection.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlhGP3BSOI/AAAAAAAAAJg/PgwXOfjDDzs/s200/mockup-view-projection.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285362397660399842" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Next we want to show a view of the display case. Click on the carpet near the kiosk, and use the UpArrow and LeftArrow on the keyboard to 'walk' around the scene until you get a good view.  When you've got it, export a JPEG.  Here's mine:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlkBor4o4I/AAAAAAAAAJo/nSKPUIIEZGU/s1600-h/mockup-view-case.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlkBor4o4I/AAAAAAAAAJo/nSKPUIIEZGU/s200/mockup-view-case.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285365616960119682" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Finally, I'm imagining that the visitor will check out the kiosk before moving into the next gallery.  Continue to use the Walk tool to move around the scene until you get a good view looking back at the kiosk.  Export a JPEG.  
Mine looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlkKNLR37I/AAAAAAAAAJw/tXmOsrCQSWo/s1600-h/mockup-view-kiosk.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlkKNLR37I/AAAAAAAAAJw/tXmOsrCQSWo/s200/mockup-view-kiosk.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285365764194426802" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We now have five images of our space to use in the presentation.  At this point they are still quite plain.  In the next part we will use The GIMP to modify these images to convey more of our vision for the exhibit. &lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7508171326769981463?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7508171326769981463'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7508171326769981463'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/11/how-to-make-museum-exhibit-mockup-with_13.html' title='How To: Make a Museum Exhibit Mockup with Free Tools, Part 2'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlgznYWlMI/AAAAAAAAAJQ/cgapiDIyQMs/s72-c/mockup-view-whole.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8277166863795240914</id><published>2007-11-12T13:51:00.001-05:00</published><updated>2008-12-29T18:41:05.920-05:00</updated><title type='text'>How To: Make a Museum Exhibit Mockup with Free Tools, Part 1</title><content type='html'>Many people who make their way into public history find themselves in the position of having to impress a potential client without necessarily having many resources to do so.  They may need to submit a proposal for a museum exhibit, for example, without being able to afford the services of a graphic designer. While it's always nice to get professional assistance, it's also nice to know that you can use freely available tools to create something a little slicker than a sketch on the back of an envelope.  In this post and the next, I'll show you how to create a simple 3D model of an exhibit that you can build into a proposal or presentation.  For the purposes of this demonstration, I want to focus on the digital tools, so the exhibit that I describe will be completely made up.&lt;br /&gt;&lt;br /&gt;To follow along, you need to download and install two freely available programs, &lt;a href="http://sketchup.google.com/"&gt;Google SketchUp 6&lt;/a&gt; and The GNU Image Manipulation Program (&lt;a href="http://www.gimp.org/"&gt;GIMP&lt;/a&gt;).  Both are available for Windows and Macs.  
Here I will give instructions for Windows; I assume the commands can be translated for Macs relatively easily.&lt;br /&gt;&lt;br /&gt;First, you have to establish the dimensions of both your potential output and your exhibit space.  Graphics files are typically described in terms of width, height and resolution.  A common size for presentation on screen is 1024 pixels wide by 768 pixels high, with a resolution of 72 dots per inch (dpi).  If you want a larger or smaller image, keep the same resolution and the same 4:3 aspect ratio of width to height.  Common sizes are 1600 x 1200 pixels, 1400 x 1050, 1024 x 768, 800 x 600, 640 x 480, 320 x 240 or 160 x 120.  Newer monitors may have a different aspect ratio such as 5:4 or 16:9, but you are probably safest sticking with 4:3 unless you know what kind of monitor or projector your presentation will appear on.&lt;br /&gt;&lt;br /&gt;If you plan to print your image on paper, you need to create it with a higher resolution, typically 240 to 300 pixels per inch (ppi) or more.  At 300 ppi, for example, an 8 x 10 inch print needs an image of 2400 x 3000 pixels.  One of the challenges of working with graphics is that something that looks good on your screen can be underwhelming when you print it out, especially if you are trying to make a poster.  Here I will assume we are creating something to be output on a computer screen.&lt;br /&gt;&lt;br /&gt;Try to get blueprints and photographs of the exhibit space if you can.  If not, make sure to get enough measurements that you can reconstruct the space.  For my made-up example, I'm going to say that my space is 15' x 20' with a 10' ceiling.  There is a 4' x 8' entrance in one of the 15' walls, and where one of the 20' walls would be there is actually an opening leading into another gallery.  I have a ceiling-mounted LCD projector to display on the 20' wall, and a display case (10' long x 4' high x 2' deep) along the 15' wall without the door.  
There will also be a free-standing kiosk with a 2' x 2' footprint that is 5' high, which I can move to an appropriate location in the room.&lt;br /&gt;&lt;br /&gt;Given the dimensions of the space, we can use SketchUp to create a rudimentary 3D model.  Start by using the Rectangle tool to draw the floor.  To get the dimensions right, click on the origin and move the cursor into the plane, drawing a rectangle.  You will see the dimensions in the lower right-hand corner of the screen as you move the cursor.  Type in &lt;em&gt;20', 15'&lt;/em&gt; and press Enter.  The program should respond by drawing your floor.  You can check your work with the Tape Measure tool to make sure the dimensions are right.  Your mockup should look like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVle2BgLvdI/AAAAAAAAAII/647ExKJsjkM/s1600-h/mockup-floor.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVle2BgLvdI/AAAAAAAAAII/647ExKJsjkM/s200/mockup-floor.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285359919905357266" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now we want to create the basic volume of the space.  Choose the Push/Pull tool (the one that looks like a block with a red arrow coming out the top), select the floor, hold down the left mouse button and pull up a little bit.  A box should extrude upwards.  Type in the distance (&lt;em&gt;10'&lt;/em&gt;) and press Enter.  We'll want to be able to see into the space, so use the Arrow to select the top face of the box, right-click and choose Erase.  
If everything worked, you should see something like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVle_ORnSZI/AAAAAAAAAIQ/Sqi2TSE5i3o/s1600-h/mockup-box.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVle_ORnSZI/AAAAAAAAAIQ/Sqi2TSE5i3o/s200/mockup-box.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285360077952731538" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Since one of the 20' walls opens into another gallery, we can erase it, too.  Use the Rectangle tool to draw a 4' x 8' entry way in one of the 15' walls and use Move to slide it into the right position if necessary.  Use the Arrow to select the door, right click and Erase it.  Click the Zoom to Extents tool (a magnifying glass with four red arrows) and then type &lt;em&gt;45&lt;/em&gt; to get a 45-degree field of view.  When you press Enter the mockup should look something like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlfHomhabI/AAAAAAAAAIY/o_jwc8SM4Q8/s1600-h/mockup-space.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlfHomhabI/AAAAAAAAAIY/o_jwc8SM4Q8/s200/mockup-space.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285360222458702258" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now we want to create our display case.  Unfortunately, the little dude who shows up by default is standing in the way.  (His name is "Bryce.")  Use the Arrow tool to select Bryce and then use Tools-&gt;Move to drag him out of the way.  Now use the Rectangle tool to draw the 10' x 2' footprint of the display case along the wall.  Use the Push/Pull tool to extrude it to a height of 4'.  
Since the top 2' of our display case is made of glass, we use the Pencil tool to draw a line around the midpoint of the case.  The mockup now looks like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlfPThuwgI/AAAAAAAAAIg/Q9TWVYkM5Go/s1600-h/mockup-cabinet-1.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlfPThuwgI/AAAAAAAAAIg/Q9TWVYkM5Go/s200/mockup-cabinet-1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285360354240414210" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Next we use the Orbit tool to rotate the image around so that we can see the far side of the display case.  Using the Paint tool, paint the top surface, and the left, front, and right top halves with transparent blue glass.  Now that you can see into the case, it is obvious that we will need an internal platform to place artifacts on.  Draw one with the Rectangle tool.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlfYNXmMuI/AAAAAAAAAIo/ENc4ERzaiJg/s1600-h/mockup-cabinet-2.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlfYNXmMuI/AAAAAAAAAIo/ENc4ERzaiJg/s200/mockup-cabinet-2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285360507206120162" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Next we create the kiosk.  We use the Orbit and Zoom tools again to get a good view of where we want to put it.  Draw the 2' x 2' footprint with the Rectangle tool, and then extrude it to a height of 5' with the Push/Pull tool.  My imaginary kiosk looks a bit like a classic arcade-style video game.  
We use the Rectangle tool to draw a Golden Section on the front face, then the Push/Pull tool to push it in about 4".&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlfiLtz6-I/AAAAAAAAAIw/KU4hoXI6zLw/s1600-h/mockup-kiosk-1.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlfiLtz6-I/AAAAAAAAAIw/KU4hoXI6zLw/s200/mockup-kiosk-1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285360678561115106" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This next bit is hard to describe, but easy enough once you get the hang of it.  By selecting the horizontal lines on the front of the kiosk, we can use the Move tool to push them in or out gently, thus sculpting the front.  If you do something too drastic, you can always Undo it.  When finished, mine looked like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlfq4QiyRI/AAAAAAAAAI4/wUbXeASXze8/s1600-h/mockup-kiosk-2.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlfq4QiyRI/AAAAAAAAAI4/wUbXeASXze8/s200/mockup-kiosk-2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285360827956906258" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Next we want to place our LCD projector on the ceiling and indicate where the image will be projected.  Rather than creating a projector model ourselves, we go to the Google 3D Warehouse and search for "projector".  &lt;a href="http://sketchup.google.com/3dwarehouse/details?mid=b68f3ebb2a34146ea609863dc598b66d&amp;prevstart=0"&gt;The one created by Rothmatic&lt;/a&gt; looks like what I had in mind, so we save it to disk and import it into the drawing.&lt;br /&gt;&lt;br /&gt;Use the Scale tool to make the projector the right size.  
Then Move it into the center of the ceiling and Rotate it to the right orientation.  In this demonstration, I've just eyeballed it, but for a real exhibit you would want to know where the projector would be mounted, and how large and where exactly the image would be cast.  You would want to make sure that most visitors wouldn't walk through the beam.  You'd also have to worry about ambient lighting being high enough for comfort but not so high as to drown out the projected image.  In the interests of pedagogy, however, we're making this up and simplifying as we go along.  Use the Rectangle tool to draw the projected image on the wall, and the Pencil tool to draw lines from each corner of the projected image back to the projector lens.  If all your lines connect, you will end up with a pyramid-shaped solid, as shown in the next image.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlf20PLD-I/AAAAAAAAAJA/YmHssgiLuaI/s1600-h/mockup-projector-1.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlf20PLD-I/AAAAAAAAAJA/YmHssgiLuaI/s200/mockup-projector-1.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285361033035845602" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Now use the Arrow tool to select each face of the pyramid in turn, right-click and Erase them.  Use the Paint tool to make the walls off-white, and the carpet gray.  
The final model should look like this:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlf_dOQvBI/AAAAAAAAAJI/_1dRyDuI9ns/s1600-h/mockup-projector-2.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlf_dOQvBI/AAAAAAAAAJI/_1dRyDuI9ns/s200/mockup-projector-2.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285361181476830226" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We now have a basic 3D model which we can use to convey an impression of how the exhibit space will look.  If you'd like to load the model into SketchUp and play with it, a copy is &lt;a href="http://digitalhistory.uwo.ca/dhh/images/mockup.skp"&gt;here&lt;/a&gt;.  In the next part, we'll generate some screenshots of our space, and load them into The GIMP for further manipulation.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8277166863795240914?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8277166863795240914'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8277166863795240914'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/11/how-to-make-museum-exhibit-mockup-with.html' title='How To: Make a Museum Exhibit Mockup with Free Tools, Part 1'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVle2BgLvdI/AAAAAAAAAII/647ExKJsjkM/s72-c/mockup-floor.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4171798388072446351</id><published>2007-11-09T18:12:00.001-05:00</published><updated>2008-12-29T18:33:23.696-05:00</updated><title type='text'>History Appliances: Laser Spirograph</title><content type='html'>A sure sign of a 1970s childhood is a fond memory of doodling with the Kenner &lt;a href="http://www.journalofantiques.com/Jan07/playing_around.html"&gt;Spirograph&lt;/a&gt; toy. In the back of my mind I've been thinking it would be fun to build something like it into a history appliance.  You can already find &lt;a href="http://www.math.psu.edu/dlittle/java/parametricequations/spirograph/SpiroGraph1.0/index.html"&gt;software versions&lt;/a&gt; online, but I wanted something that could be used at the periphery, rather than the focus, of attention.  On a recent trip to &lt;a href="http://www.activesurplus.com/"&gt;Active Surplus&lt;/a&gt; in Toronto I realized I could build a version quite cheaply.  So here it is: a little too thrown together even to call a hack, this is really a &lt;a href="http://catb.org/jargon/html/K/kluge.html"&gt;kludge&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;I started by cutting a $1 laser pointer out of its casing and soldering some wires to it so I could switch it on and off electronically.  
Here it is on a breadboard with a 5v voltage regulator.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVldRfpnRUI/AAAAAAAAAHI/oD3FgM7GTbI/s1600-h/Laser-Spirograph-laser.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVldRfpnRUI/AAAAAAAAAHI/oD3FgM7GTbI/s200/Laser-Spirograph-laser.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285358192831186242" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The laser shines on a mirror that is crazy-glued to a cylindrical piece of dense foam and mounted on the shaft of a motor.  I used Lego motors because I had a pair of them.  The reflection bounces off another mirror, similarly mounted, and is projected on to a screen made from a 3x5 card.  The motors tend to slip around when they are running so I put a rubber mat under each.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVldZ_z9iLI/AAAAAAAAAHQ/5KnEIK2YTN8/s1600-h/Laser-Spirograph-motors.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVldZ_z9iLI/AAAAAAAAAHQ/5KnEIK2YTN8/s200/Laser-Spirograph-motors.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285358338903476402" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVldiCFxjDI/AAAAAAAAAHY/Tmv1hAMthIQ/s1600-h/Laser-Spirograph-screen.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVldiCFxjDI/AAAAAAAAAHY/Tmv1hAMthIQ/s200/Laser-Spirograph-screen.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285358476954012722" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The motors are controlled by pulse width modulation, using a &lt;a 
href="http://www.phidgets.com/products.php?product_id=1060"&gt;Phidgets MotorControl LV&lt;/a&gt; board.  I used a wall wart to supply 9v for the motors.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVldqCPOU5I/AAAAAAAAAHg/lr_9R4SWBks/s1600-h/Laser-Spirograph-motorcontrol.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVldqCPOU5I/AAAAAAAAAHg/lr_9R4SWBks/s200/Laser-Spirograph-motorcontrol.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285358614432600978" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To be able to fiddle with the speed of each motor, I used the &lt;a href="http://www.phidgets.com/products.php?product_id=1018"&gt;Phidgets Interface Kit&lt;/a&gt;, the &lt;a href="http://www.phidgets.com/products.php?product_id=1113"&gt;mini joystick&lt;/a&gt; and a &lt;a href="http://cycling74.com/products/maxmsp"&gt;Max/MSP&lt;/a&gt; patch.  
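&lt;br /&gt;&lt;br /&gt;The Max/MSP patch is not reproduced here.  A minimal sketch of the kind of mapping it performs, written in Python rather than Max/MSP, might look like this; the value ranges (0 to 1000 for a joystick axis, -100 to 100 for a motor) and all of the names are assumptions for illustration and are not taken from the Phidgets API.

```python
def joystick_to_velocity(reading, centre=500, span=500, dead_zone=25, limit=100.0):
    """Map one joystick axis reading to a signed motor velocity in percent.

    Every default here is an assumption for illustration: the axis is taken
    to report 0..1000 with 500 at rest, and the motor controller is taken to
    accept -100..100, where a negative value means reverse.
    """
    offset = reading - centre
    # Ignore small wobbles around the rest position so that the mirrors
    # do not creep when the stick is released.
    if dead_zone >= abs(offset):
        return 0.0
    velocity = limit * offset / span
    # Clamp in case a reading strays outside the expected range.
    return max(-limit, min(limit, velocity))


if __name__ == "__main__":
    # One axis per motor: x drives the first mirror, y the second.
    for x, y in [(500, 500), (750, 250), (1000, 0)]:
        print(x, y, joystick_to_velocity(x), joystick_to_velocity(y))
```

With one axis per motor, pushing the stick diagonally changes the ratio between the two rotation speeds, and it is that ratio that decides which figure the laser traces.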
&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVldzUhUPOI/AAAAAAAAAHo/BgP59cUv_8M/s1600-h/Laser-Spirograph-interfacekit.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVldzUhUPOI/AAAAAAAAAHo/BgP59cUv_8M/s200/Laser-Spirograph-interfacekit.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285358773959146722" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The whole setup looked like this.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVld77DIAxI/AAAAAAAAAHw/xmdqc9oArF8/s1600-h/Laser-Spirograph-setup.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVld77DIAxI/AAAAAAAAAHw/xmdqc9oArF8/s200/Laser-Spirograph-setup.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285358921740452626" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;With the motors both running, the dot of the laser pointer is perturbed by first one rotation, then the next, tracing out a familiar spirograph-style image.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVleFIMVvmI/AAAAAAAAAH4/8e8rV8kyFOU/s1600-h/Laser-Spirograph-big.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVleFIMVvmI/AAAAAAAAAH4/8e8rV8kyFOU/s200/Laser-Spirograph-big.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285359079887584866" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;As you vary the speed and direction of rotation for the two mirrors, you get a range of different patterns.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" 
href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVleNhYC8nI/AAAAAAAAAIA/YIKiKPGWqLI/s1600-h/Laser-Spirograph-collected.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVleNhYC8nI/AAAAAAAAAIA/YIKiKPGWqLI/s200/Laser-Spirograph-collected.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285359224086524530" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;To demonstrate, I used the joystick to vary the parameters of the system.  In an application, it would be hooked up to streaming data instead.  Groovy!&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/nostalgia" rel="tag"&gt;nostalgia&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4171798388072446351?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4171798388072446351'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4171798388072446351'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/11/history-appliances-laser-spirograph.html' title='History Appliances: Laser Spirograph'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_mg_RqiBYrpE/SVldRfpnRUI/AAAAAAAAAHI/oD3FgM7GTbI/s72-c/Laser-Spirograph-laser.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1040812404149970756</id><published>2007-10-28T10:47:00.000-04:00</published><updated>2007-10-28T12:12:15.563-04:00</updated><title type='text'>Seams and the Suspension of Disbelief</title><content type='html'>At an unconference that I was at a few weeks ago, &lt;a href="http://www.lancs.ac.uk/fass/sociology/staff/suchman/suchman.htm"&gt;Lucy Suchman&lt;/a&gt; began a conversation about illusion and suspension of disbelief in technocultural systems.  The example that she gave was animation: we really buy into the actions of &lt;a href="http://www.felixthecat.com/IMG/multimedia/animations/felixwalk.gif"&gt;lovable cartoon characters&lt;/a&gt;, and readily attribute intentionality to them.  And yet, of course, there is nothing beneath the surface to match the imagined anthropomorph.  Such suspension of disbelief seems to follow quite readily when the details are right.  It's hard to look at the slowly pulsing LED of a "sleeping" PowerBook and not feel like the machine is a little bit more human for it.  &lt;br /&gt;&lt;br /&gt;Suspension of disbelief is something that historians strive for, too.  In public history, for example, costumed interpreters, museum dioramas, and replicas of artifacts and documents stand in for the originals that they are intended to resemble, although they may have little or no causal relationship to them.  The monographs of traditional history are also simulacra.  
They bear a principled relationship to past events, but have rarely partaken of them.  Instead, their job is to put the reader into some kind of relationship with the past, to get him or her to see through the physical reality of the codex.&lt;br /&gt;&lt;br /&gt;The discipline of citation allows sophisticated readers to assess the evidentiary material from which a particular account is constructed.  Each footnote serves as a kind of thread.  Pulling on it may tighten a seam or rip it open.  Professional historians expect the body of a work to be relatively smooth and tightly integrated, but they also expect to be able to use the footnotes to take it apart as necessary.  Ideally, monographs always present both smooth and seamful faces.&lt;br /&gt;&lt;br /&gt;In digital history, we have to pay attention to finding the right balance of smoothness and seamfulness, but we can work at a number of different levels, ranging from low-level electronic and hardware decisions to very high-level software abstractions.  It is possible for something to appear smooth at every level.  Carrying on the unconference conversation, &lt;a href="http://ckodonnell.blogspot.com/2007/10/rough-smooth-phidgets.html"&gt;Casey O'Donnell&lt;/a&gt; gives the examples of the iPod and Wii.  As he says, "these devices (mostly) just work," and have been designed to suppress tinkering.  It is possible, however, to construct systems that are smooth at one level and seamful at another, that signal their willingness to be hacked in particular ways.  (See the work of &lt;a href="http://www.dcs.gla.ac.uk/~matthew/DCS/Home.html"&gt;Matthew Chalmers&lt;/a&gt; for more on explicitly seamful design.)  At the unconference, I demonstrated a simple musical instrument made from a distance sensor, a Phidgets interface, a laptop running Max/MSP and a MIDI software synthesizer.  All of the seams were out in the open--in fact it was a bit messy.  
But to play the instrument all you have to do is wave your hand in front of the sensor and you get glissandos of marimba notes.  At the behavioral level it is fun and responsive; at the hardware and software levels it is obviously hackable.&lt;br /&gt;&lt;br /&gt;As we develop historical projects online, we need to ask ourselves how we can incorporate tinkering while maintaining smoothness where we want it.  A great recent example of this is &lt;a href="http://devonelliott.blogspot.com/2007/10/wikiarchives.html"&gt;Devon Elliott's suggestion&lt;/a&gt; that archives use wiki technology to allow historians and other researchers to create item-level metadata.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/bricolage" rel="tag"&gt;bricolage&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/seamful+design" rel="tag"&gt;seamful design&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1040812404149970756?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1040812404149970756'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1040812404149970756'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/10/seams-and-suspension-of-disbelief.html' title='Seams and the Suspension of Disbelief'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8278816598430569075</id><published>2007-10-21T09:22:00.000-04:00</published><updated>2007-10-21T11:03:50.888-04:00</updated><title type='text'>The Archive as Time Machine</title><content type='html'>[Cross-posted to Cliopatria &amp;amp; Digital History Hacks]&lt;br /&gt;&lt;br /&gt;Our story so far: even though we know that it's probably impossible, we've decided to think through the problem of building a time machine.  In the &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/08/some-varieties-of-time-machine-worth.html"&gt;last episode&lt;/a&gt; we decided that we wouldn't want one that allowed us to rewrite the past willy-nilly... because what would be the point of history then? It turned out, however, that the world itself is a pretty awesome time machine, tirelessly transporting absolutely everything into the future.  Today we look at the archive widely construed: one small portion of the world charged with the responsibility of preserving our collective representational memory.&lt;br /&gt;&lt;br /&gt;As every schoolboy used to know (at least back when there were 'schoolboys' who knew the Classics), Thucydides wanted his work to be "judged useful by those inquirers who desire an exact knowledge of the past as an aid to the interpretation of the future ... an everlasting possession, not the showpiece of an hour."  The fact that we know this twenty-five centuries later speaks pretty well for the potential of preserving representations for long periods of time.  Precisely because they can be readily transferred from one material  substratum to another, written words, well, remain.  
Of course, since languages change over time, there can be difficulties of decipherment or translation, and exactly which words survive can be a real crapshoot.&lt;br /&gt;&lt;br /&gt;With the relatively recent spread of optical, magnetic, and other media, it became necessary to archive media readers, too.  The endurance of the written word (or new cousins like photographs and phonographic records) now also depended on devices to amplify, transduce or otherwise transform signals into a form that is visible or audible to human users. Along with the obsolescence of media, librarians and archivists now had to worry about the obsolescence of reading devices.&lt;br /&gt;&lt;br /&gt;Enter the computer.  Representations are now being created in such quantity that the mind boggles, and they can be transformed into one another so easily that we've taken to referring to practically all media as simply "new."  This, of course, presents librarians and archivists with a class of problems we could also refer to as "new."   My students and I were talking about this in my &lt;a href="http://digitalhistory.uwo.ca/h513_0708/"&gt;digital history grad class&lt;/a&gt; a few weeks ago.  How do we store all of this born-digital material in a form that will be usable in the future, and not just the showpiece of an hour?  One possibility, technically sweet but practically difficult, is to create &lt;span style="font-style:italic;"&gt;emulators&lt;/span&gt;.  The archive keeps only one kind of machine: a general-purpose computer that is Turing-equivalent to every other.  In theory, software that runs on the general-purpose machine can emulate any desired computer.&lt;br /&gt;&lt;br /&gt;My students are most familiar with systems that emulate &lt;a href="http://www.emulator-zone.com/"&gt;classic video and arcade games&lt;/a&gt;, so that framed our discussion.  One group was of the opinion that all you need is the 'blueprint' to create any technological system.  
Another thought that you would be losing the experience of what it was like to actually use the original system.  (Here I should say that I'm solidly in the latter camp.  No amount of time spent on the &lt;a href="http://www.emulator-zone.com/doc.php/computer/ccs64.html"&gt;CCS64&lt;/a&gt; emulator can convey the experience of cracking open the Commodore 64 power transformer and spraying it with compressed air so it wouldn't overheat and crash the machine while you were hacking.)&lt;br /&gt;&lt;br /&gt;More than this, however, the idea that a blueprint is all you need to recreate a technical system shows how much more attention is focussed on the ghost than on the machine these days.  The showiness of new, endlessly plastic media obscures their crucial dependence on a systematic colonization of the nanoscale.  I might be able to read a microfiche with sunlight and some strong lenses, but never a DVD.  The blueprint for a DVD reader is completely useless without access to some of the most advanced fabrication techniques on the planet.  So we're in the process of creating all this eternally-new stuff, running on systems whose lifecycles are getting shorter every year.  What would Thucydides say?&lt;br /&gt;&lt;br /&gt;Next time: how and why to send messages way into the future. 
&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/cliopatria" rel="tag"&gt;Cliopatria&lt;/a&gt; | &lt;a href="http://technorati.com/tag/gedankenexperiment" rel="tag"&gt;gedankenexperiment&lt;/a&gt; | &lt;a href="http://technorati.com/tag/time+machines" rel="tag"&gt;time machines&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8278816598430569075?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8278816598430569075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8278816598430569075'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/10/archive-as-time-machine.html' title='The Archive as Time Machine'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-2165913330087093235</id><published>2007-10-12T14:45:00.000-04:00</published><updated>2007-10-12T16:52:47.018-04:00</updated><title type='text'>Unoriginal</title><content type='html'>I'm in Montreal this week participating in the &lt;a href="http://digitalhistory.concordia.ca/unconference/index.php/Main_Page"&gt;Playful Technocultures&lt;/a&gt; unconference and the annual meeting of &lt;a href="http://www.4sonline.org/"&gt;4S&lt;/a&gt;.  I've met a lot of interesting people, learned about what's been going on in &lt;a href="http://en.stswiki.org/index.php/Main_Page"&gt;STS&lt;/a&gt; since I last dropped in, and had a number of thought-provoking conversations.  
For me, a lot of the discussion has centered on (artificial?) distinctions between play and work, on what makes something "serious" or not.  Today, for example, I had lunch with three RPI guys, &lt;a href="http://ishotthecyborg.blogspot.com/"&gt;Hector&lt;/a&gt;, &lt;a href="http://homepage.mac.com/codonnell/"&gt;Casey&lt;/a&gt; and &lt;a href="http://www.sts.rpi.edu/index.php?siteid=20&amp;pageid=209&amp;personID=349&amp;deptid=6&amp;pgid=9"&gt;Sean&lt;/a&gt;.  We were talking about the role of blogging in academic careers, how it is not yet valued for promotion or tenure, even though it is clearly a form of public engagement.  Many of us, in fact, have already found our online reputations and readership to be at least as beneficial as our published work in providing access to scholarly opportunities, funding, and other good stuff.  The academic perception of professional blogging is bound to change as a generation of academic bloggers becomes tenured, and committees begin to recognize that blogging may be fun, but it can also be work, that blogs can be about more than &lt;a href="http://www.lesaintbock.com/html/home.html"&gt;where you ate lunch&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Our discussion turned from there to the fact that bloggers tend to value substantive posts much more than short ones that link to other things of interest.  Sean noted, however, that given the sheer volume of stuff that comes through the feed reader every day, these link posts serve a useful "buzz" function... you tend to check out the pointers that recur in the blogs that you follow regularly.  In a sense, both the glut of information and the new value of "unoriginal" content (like link posts) are concomitants of the shift to what Roy Rosenzweig called the &lt;a href="http://www.historycooperative.org/journals/ahr/108.3/rosenzweig.html"&gt;culture of abundance&lt;/a&gt;.  
There is way too much out there now to monitor by yourself; you really need other people to add their "me too" when someone thinks something is cool.  Think of these link posts as providing a gradient to the search space, so you or your bots have a better chance of finding spikes of interest.&lt;br /&gt;&lt;br /&gt;Don't get me wrong: I think originality can be a good thing, but I don't think that it's the only good thing.  The internet gives us instant access to the contents of the hive mind.  It's easy to find out that someone else has already had your brainwave, or done the hack that you were planning to try.  Don't let that stop you.  You have to play with other people's ideas, words, tropes, code, artifacts, instruments, and story lines to achieve any kind of mastery of anything.  Besides, historians are fond of pointing out that every new new thing actually has a long past [insert unoriginal allusion to Santayana here].  Sure the collective is doomed to repeat things, but how else could it memorize them?&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/buzz" rel="tag"&gt;buzz&lt;/a&gt; | &lt;a href="http://technorati.com/tag/gradient" rel="tag"&gt;gradient&lt;/a&gt; | &lt;a href="http://technorati.com/tag/play" rel="tag"&gt;play&lt;/a&gt; | &lt;a href="http://technorati.com/tag/social+memory" rel="tag"&gt;social memory&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-2165913330087093235?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2165913330087093235'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2165913330087093235'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/10/unoriginal.html' title='Unoriginal'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-544468387136341121</id><published>2007-09-27T12:33:00.000-04:00</published><updated>2007-09-27T13:11:04.170-04:00</updated><title type='text'>Brainstorming History Appliances</title><content type='html'>This year I've added a studio component to my &lt;a href="http://digitalhistory.uwo.ca/h513_0708/"&gt;graduate course in digital history&lt;/a&gt;, so the students have a chance to learn some of the fundamentals of interaction design and apply them to their work in public history.  In yesterday's class I gave them the task of brainstorming gadgets, appliances, devices, tools or toys that would somehow, magically "dispense history" or put their users in touch with the past in some other way.  (The assignment is &lt;a href="http://digitalhistory.uwo.ca/h513_0708/?page_id=5"&gt;here&lt;/a&gt;).  Many of the ideas that they came up with were really interesting.  In no particular order, here are some of my favorites.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Heritage knitting needles&lt;/strong&gt;.  Passed down within a family, these needles take on the ability to guide their user in the re-creation of any pattern they've been used for in the past.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Tangible spray&lt;/strong&gt;.  This comes in an aerosol can.  When you spray it in front of you, a grey mist appears.  You can reach into the mist and feel the past for a few moments.  When the mist dissolves, you're grasping thin air.  You might get hooked on such an experience, and buy one spray can after another.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;History hoe&lt;/strong&gt;. 
Use this in your garden to grow heirloom or extinct plant species.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Yelling documents&lt;/strong&gt;.  You put your primary source into a machine like a microfiche reader.  A stern, professorial face appears on the screen of the reader.  As you make interpretations about the document, the reader will berate you in a British accent if you get something wrong.  Hard to please, it only admits correct interpretations grudgingly, with harrumphing noises.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Reverse Babel Fish&lt;/strong&gt;. Put on this hearing aid, and everyone around you starts speaking in Old English.&lt;br /&gt;&lt;br /&gt;Not surprisingly, many of the ideas reworked themes familiar from fantasy or science fiction, like the talking genealogy hat a la Harry Potter (tilt the floppy brim to fast forward or reverse) or a "transporta-potty" that is like Dr. Who's phone booth (flush to reset).  Some of them related to subjects of perennial interest to students, like the cabinet that dispensed historical cocktails with music appropriate to the period.&lt;br /&gt;&lt;br /&gt;For me, part of the fun was asking the students beforehand about their interests, hobbies and skills.  I'm not sure what role we will find for talents that include piano playing, gardening, belly dancing, horse riding, knitting and snowboarding... but I'm glad to know their imaginations are in good working order.  
There are links to all of their blogs on the course website.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-544468387136341121?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/544468387136341121'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/544468387136341121'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/09/brainstorming-history-appliances.html' title='Brainstorming History Appliances'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4456550769612949564</id><published>2007-09-17T09:38:00.000-04:00</published><updated>2007-09-17T11:43:01.290-04:00</updated><title type='text'>The Importance of Infrastructure</title><content type='html'>The architect Christopher Alexander is well-known for having developed the idea of &lt;a href="http://www.amazon.com/Timeless-Way-Building-Christopher-Alexander/dp/0195024028/"&gt;patterns&lt;/a&gt;, each of which "describes a problem which occurs over and over again in our environment, and then describes the core of the solution to that problem, in such a way that you can use this solution a million times over, without ever doing it the same way twice."  The idea was enthusiastically &lt;a href="http://www.amazon.com/Design-Patterns-Object-Oriented-Addison-Wesley-Professional/dp/0201633612/"&gt;adopted&lt;/a&gt; by software engineers and is now found in various forms throughout the digital realm. One very interesting manifestation is the recent report by Thorsten Haas, Lars Weiler and Jens Ohlig on design patterns for "&lt;a href="http://imakethings.com/Hacker-Space-Design-Patterns.pdf"&gt;Building a Hacker Space&lt;/a&gt;." &lt;br /&gt;&lt;br /&gt;This report is required reading for anyone who is interested in the transformative or disruptive potential of new technologies in academia.  It describes ways of creating and sustaining spaces for people to hack in, by providing a series of problems and solutions.  For example, "You have a chicken-and-egg problem: What should come first? Infrastructure or projects?" They suggest that you "Make everything infrastructure-driven.  
Rooms, power, servers, connectivity, and other facilities come first. Once you have that, people will come up with the most amazing projects you didn't think about in the first place."&lt;br /&gt;&lt;br /&gt;This pattern fits in well with work in cognitive science that suggests that human reasoning and memory crucially depend on richly structured environments that are full of tools (e.g., &lt;a href="http://www.amazon.com/Cognition-Bradford-Books-Edwin-Hutchins/dp/0262581469/"&gt;Hutchins&lt;/a&gt;, &lt;a href="http://www.amazon.com/Natural-Born-Cyborgs-Technologies-Future-Intelligence/dp/0195177517/"&gt;Clark&lt;/a&gt;).   In my own research, I've found that modest environmental changes can have significant effects that I couldn't have anticipated.  When I was working in linguistic theory, for example, I had a chance to move into a new office where I installed a wall of whiteboards.  Being able to see a lot of diagrams spread out in front of me changed my understanding of the material, making it much more visual, and suggested different research questions.  When I was studying for my comprehensive exams, I splurged and bought an expensive ergonomic chair.  Up to that point, I had always thought I was a fidgety person (too much coffee and Coca-Cola), but now I could sit still and read for 8 or 9 hours at a time.  I've recently had a chance to set up a study with my workbench directly behind my desk.  Now I can rotate my chair 180 degrees and there is the soldering station, Dremel, multimeter, audio equipment, Phidgets, bins of components, and so on.  Having tools and supplies ready-to-hand makes it easier for me to imagine hacks that involve a hardware component.&lt;br /&gt;&lt;br /&gt;Tools cost money, however, and most grants are resolutely project-driven and fenced in by disciplinary boundaries.  In retrospect it is clear to me how whiteboards might make someone a better linguist, or a good chair a better student. 
I'm finding that a modest electronics lab is giving me a better understanding of the role of acoustics in history.  It's hard to imagine convincing a granting agency of these things.  Grants tend to be short-term and result-oriented.  If you don't know what the benefit will be, or can't relate a particular piece of equipment to a particular result, it is hard to make a convincing case for spending the money.&lt;br /&gt;&lt;br /&gt;But some people get it.  A few months ago I heard a poignant story from a Canadian researcher.  He requested a lab that was tailored to his work, and ended up with an unsuitable boxy linoleum-floored room with computers facing four walls around the outside perimeter.  He doesn't like the space and neither do his students.  To make matters worse, a visiting researcher from Sweden showed them pictures of his lab, which looked like something from a design magazine.  "That looks like a wonderful space," my friend said wistfully, "I wouldn't mind just hanging out there."  "Yes," the visitor replied, "people come to hang out and end up working."  Many of my colleagues treat Starbucks or other cafes as workplaces, finding them much more salubrious than their allotted space.&lt;br /&gt;&lt;br /&gt;In &lt;em&gt;&lt;a href="http://www.amazon.com/Electric-Sound-Promise-Electronic-Music/dp/0133032310/"&gt;Electric Sound&lt;/a&gt;&lt;/em&gt;, Joel Chadabe describes the London digs of an early 1980s synthesizer company. "Opening a single large wooden door ... one entered a large foyer, bare except for beautiful paintings, Charles Rennie Mackintosh furniture, and a wooden table off to one side behind which sat a receptionist. There were demonstration areas upstairs ('by appointment only, of course'), and there was a large cafe downstairs, with resident cook and waiter, for staff and customers. In an adjacent building, there was a garage to which selected customers were given automatic door openers so they could privately park their cars."  
'We didn't always make corporately sensible decisions,' one of the owners told Chadabe, 'but had we been accountants we wouldn't have done it at all.  You could buy innovative cutting edge technology in a private, comfortable environment. It was the sort of environment that we wanted to work in, so the natural assumption was that if we wanted to work there customers would want to come there. And they did. It was immensely successful.'&lt;br /&gt;&lt;br /&gt;It's a pattern you can see over and over in the histories of science, technology, and the arts: the right infrastructure attracts the right people and then something really cool happens.  But it isn't possible to predict in more detail than that.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/infrastructure" rel="tag"&gt;infrastructure&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4456550769612949564?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4456550769612949564'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4456550769612949564'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/09/importance-of-infrastructure.html' title='The Importance of Infrastructure'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4134565890779036729</id><published>2007-09-10T17:44:00.001-04:00</published><updated>2008-12-29T18:21:04.700-05:00</updated><title type='text'>Sounds Like Sepia</title><content type='html'>One of the things that I was working on over the summer was finding more ambient ways to communicate historical knowledge.  As part of that work, I've been recording soundscapes and have gotten in the habit of making a test recording outside my house before setting out.  When listening to these test recordings the other day I noticed something odd.  I could hear the sound of crickets and other insects, a distant car engine, snatches of music on the breeze, birds, the indistinct voices of children playing.  Even though I had recorded the track only a few days earlier, I felt like I could be listening to any summer morning in my lifetime.  Somehow the sounds had become unstuck from the time I recorded them and were lazily drifting around.  Or so it seemed.&lt;br /&gt;&lt;br /&gt;It's hard to investigate a feeling but I decided to give it a little more thought.  When we are able to see something that is making noise we readily correlate the sound with the object.  This can be disrupted a bit in cases where it takes the sound a noticeable amount of time to reach us, as when watching someone bat a baseball from a distance.  We judge the distance of a sound source that we can't see in part by its amplitude--quieter things tend to be farther away--and in part by the environmental coloration that the sound has undergone.  
A sound that reaches our ears directly arrives there before reflections of that sound off of stuff in the environment. If the reflections arrive quickly, they change the timbre of the perceived sound.  If they arrive slowly they are perceived as echoes.  The ability to record a sound and play it back later makes it possible to create arbitrary temporal distance between the source of a sound and its perception.&lt;br /&gt;&lt;br /&gt;What I got to wondering was whether environmental coloration might be used to make a sound seem as if it were coming from a more distant past.  A visual analogy might be coloring a photograph with sepia tones (&lt;a href="http://www.niksoftware.com/colorefexpro/usa/entry.php?info=colorefexpro/samples//midsepia.shtml"&gt;cheesy&lt;/a&gt;) or filtering the image to simulate the aging of old film (&lt;a href="http://www.niksoftware.com/colorefexpro/usa/entry.php?info=colorefexpro/samples//opc.shtml"&gt;better&lt;/a&gt;).  In order to experiment, I set up the following hack.&lt;br /&gt;&lt;br /&gt;First, I built a simple circuit to generate a relatively pure tone using the handy &lt;a href="http://www.uoguelph.ca/~antoon/gadgets/555/555.html"&gt;555 timer&lt;/a&gt; and a few other components.  I set up a digital audio recorder right above the circuit and recorded a few seconds of irritating buzz.  The rig is shown in the photo below... the white thing on the speaker is half of a Nalgene container that I used for a resonator.  I've included the schematic in case you want to make your own.  
You can lower the pitch by increasing the resistance of the 100K resistor and raise it by decreasing the resistance.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlawvlJd8I/AAAAAAAAAGg/q_DcPe3E4XM/s1600-h/ec-circuit.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlawvlJd8I/AAAAAAAAAGg/q_DcPe3E4XM/s200/ec-circuit.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285355431148484546" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVla3gvIiDI/AAAAAAAAAGo/qPZW8F8ibFs/s1600-h/ec-rig.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVla3gvIiDI/AAAAAAAAAGo/qPZW8F8ibFs/s200/ec-rig.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285355547422918706" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlbK7XtIuI/AAAAAAAAAG4/WCaSYvtu6Rg/s1600-h/ec-schematic.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlbK7XtIuI/AAAAAAAAAG4/WCaSYvtu6Rg/s200/ec-schematic.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285355880989926114" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;When I recorded the sound of the tone generator right above the circuit it sounded like  &lt;a href="https://sites.google.com/site/digitalhistoryhacks/Home/data-files/ec-close.wav"&gt;this&lt;/a&gt; (WAV file).  I then moved the recorder to a position about 10 feet away, near an open window.  Finally I moved it into another room about 20 feet away.  
Using &lt;a href="http://audacity.sourceforge.net/"&gt;Audacity&lt;/a&gt;, I amplified both of these recordings so that the overall sound level was the same as the first one (-21 dB).  The recording made at an intermediate distance sounded like &lt;a href="https://sites.google.com/site/digitalhistoryhacks/Home/data-files/ec-med.wav"&gt;this&lt;/a&gt; (WAV file) and the one made at the far distance sounded like &lt;a href="https://sites.google.com/site/digitalhistoryhacks/Home/data-files/ec-far.wav"&gt;this&lt;/a&gt; (WAV file). Using the frequency analysis capability of Audacity, it is easy to see the effect of distance and noise on the subsequent recordings.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlbUGb0TpI/AAAAAAAAAHA/viUzVO-ayRo/s1600-h/ec-freq-analysis.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 96px; height: 200px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlbUGb0TpI/AAAAAAAAAHA/viUzVO-ayRo/s200/ec-freq-analysis.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285356038578785938" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Does environmental coloration make the sound seem more distant in time as well as space?  I'm not sure.  
I thought I'd put it out there in case anyone else has the same perception, or wants to hack the hack.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/soundscapes" rel="tag"&gt;soundscapes&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4134565890779036729?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4134565890779036729'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4134565890779036729'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/09/sounds-like-sepia.html' title='Sounds Like Sepia'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlawvlJd8I/AAAAAAAAAGg/q_DcPe3E4XM/s72-c/ec-circuit.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-2200503454915048159</id><published>2007-09-01T12:03:00.000-04:00</published><updated>2007-09-01T12:59:01.155-04:00</updated><title type='text'>A Plea for Concept Projects</title><content type='html'>I'm the first person to admit that I know nothing about fashion, so I could be wrong about this... 
but my understanding is that the things to be seen on the runways of Paris, Milan or New York are not really intended to be worn.  The point of taking attractive (if emaciated) people and dressing them up like &lt;a href="http://www.telegraph.co.uk/fashion/main.jhtml?xml=/fashion/2006/07/06/wfash06.xml"&gt;samurai and astronauts&lt;/a&gt;, or &lt;a href="http://news.bbc.co.uk/1/hi/in_pictures/6290669.stm"&gt;festooning them with lanterns&lt;/a&gt;, is to stimulate the imagination.  Designers have a space to convey a sweeping vision without worrying too much about practicality, and their public is drawn to their more quotidian offerings by having a sense of a bigger picture.  Automotive engineers, too, have a tradition of creating &lt;a href="http://www.motortrend.com/future/concept_cars/"&gt;concept cars&lt;/a&gt;: one-of-a-kind prototypes that push the boundaries of particular forms, get ideas into circulation, and draw attention to the imagination and technical expertise of their creators.&lt;br /&gt;&lt;br /&gt;Academic historians (and many other humanists and social scientists) don't really have a tradition of creating projects that are not meant to be judged primarily in terms of utility or veracity.  But we can't complain that our research is treated as marginal unless we are willing to make some effort to put our thinking into forms that are of interest to audiences other than our close peers.  I'm not suggesting that scholarly traditions be weakened in any way, just that we create some new traditions.&lt;br /&gt;&lt;br /&gt;Digital scholarship puts very few restrictions on form and makes it easy to reach a potentially vast audience almost instantly.  And yet most new projects offer only incremental advances over the pre-digital state of the art, if that.  
It's time to make some space for concept projects, to put work out there because it's visionary or beautiful or wacky or reflexive or just, as Thoreau put it, because it "affects the quality of the day."&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/transaction+costs" rel="tag"&gt;transaction costs&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-2200503454915048159?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2200503454915048159'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2200503454915048159'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/09/plea-for-concept-projects.html' title='A Plea for Concept Projects'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7437251229316478913</id><published>2007-08-27T16:10:00.000-04:00</published><updated>2007-08-27T16:40:16.590-04:00</updated><title type='text'>Some Varieties of Time Machine Worth Having</title><content type='html'>[Cross-posted to Cliopatria &amp;amp; Digital History Hacks]&lt;br /&gt;&lt;br /&gt;I've been invited to join the crack team of bloggers at &lt;a href="http://hnn.us/blogs/2.html"&gt;Cliopatria&lt;/a&gt;, so I will be cross-posting there and at &lt;a href="http://digitalhistoryhacks.blogspot.com"&gt;Digital History Hacks&lt;/a&gt; from time-to-time.  I'm excited by the opportunity to develop a series of posts on a topic of general interest to historians, while keeping enough technical content to satisfy my regular readers.  So... let's build a time machine!&lt;br /&gt;&lt;br /&gt;At some point in the early nineties I copied down a quote by Loren Eiseley in a commonplace book:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;A man who has once looked with the archaeological eye will never quite see normally again. He will be wounded by what other men call trifles. It is possible to refine the sense of time until an old shoe in the bunch of grass or a pile of nineteenth-century beer bottles in an abandoned mining town tolls in one's head like a hall clock. This is the price one pays for learning to read time from surfaces other than an illuminated dial. It is the melancholy secret of the artifact, the humanly touched thing. 
&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/exec/obidos/tg/detail/-/0803267355/"&gt;The Night Country&lt;/a&gt;&lt;/span&gt; 1971:81.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I made a note of the source, but not how I came upon it.  I know I wasn't reading Eiseley's work because I used to keep lists of the books that I read.  At the time I was studying linguistics and cognitive science, and in the early summer of 1994 I dipped into ecological anthropology.  I assume that I came across the quote then.  Now I don't really remember the context as clearly as it sounds.  I'm making inferences from my old notebooks and from &lt;a href="http://groups.google.com/"&gt;Usenet&lt;/a&gt; posts that have been archived online for 15 years.  Reading through those old posts reminds me of what I was doing at the time, although I remember being quite a bit cooler than some of my posts make me sound.  I wish that that were my own melancholy secret, but at some point in the 1990s I realized that everything that I had ever typed into a computer was going to be saved forever and eventually made available to everyone.&lt;br /&gt;&lt;br /&gt;The Eiseley quote stuck with me, and occasionally I would imagine what it would be like to have an 'archaeological eye.'  Being given more to science fiction than fantasy, I tended to imagine a mechanism or instrument or device of some sort, rather than a magical object like a crystal ball.  Now at this point I should probably stop and reassure you that I &lt;span style="font-weight:bold;"&gt;know&lt;/span&gt; that it may well be impossible to build a time machine in general, and that it is certainly impossible for me to build one.  But I think it can sometimes be quite productive to start with something that you know is impossible, and think through some of the implications anyway.  As a genre, fiction is ideally suited to this kind of gedankenexperiment; academic monographs less so.  
Blogs lie somewhere in between.  As my fellow Cliopatrian Timothy Burke once &lt;a href="http://www.swarthmore.edu/SocSci/tburke1/perma12605.html"&gt;wrote&lt;/a&gt;, a blog is an ideal "place to publish small writings, odd writings, leftover writings, lazy speculations, half-formed hypotheses." Plus, time machines are a heck of a lot of fun.&lt;br /&gt;&lt;br /&gt;When most people think of a time machine, I suspect they probably imagine something like the &lt;a href="http://www.amazon.com/Time-Machine-Penguin-Classics/dp/0141439971/"&gt;H. G. Wells&lt;/a&gt; version: jump in, set the dial to whenever, hit a button and you are there.  This kind of time machine allows (or requires) you to alter the course of events.  Sometimes the results are tragic.  In the classic Ray Bradbury story "&lt;a href="http://www.amazon.com/Sound-Thunder-Other-Stories/dp/0060785691"&gt;A Sound of Thunder&lt;/a&gt;," one of the characters steps on a prehistoric butterfly and changes the future decidedly for the worse.  Sometimes the results are comic, as in Connie Willis's &lt;a href="http://www.amazon.com/Say-Nothing-Dog-Connie-Willis/dp/0553575384/"&gt;re-take&lt;/a&gt;  of Jerome K. Jerome.  A skeptic might point out that if this kind of time travel were ever going to be possible, we'd already be surrounded by people whizzing back from the future to take our fresh water or oxygen, or buy stock in Google, or exhort their younger selves to study harder, or whatever.  For historians, the real problem with being able to alter the past is that it would seem to allow for &lt;a href="http://www.imdb.com/title/tt0096928/"&gt;Bill &amp;amp; Ted-style rewriting&lt;/a&gt; on a grand scale, and thus make history utterly pointless.  The mutability of history, after all, crucially depends on the immutability of the past.&lt;br /&gt;&lt;br /&gt;In fact, physicists are split on the possibility of time travel.  
Some of those who think time travel might be possible suggest that there could be some law of physics that prevents the creation of weird causal loops--you know, the kind where you go back in time to become your own great-great-grandfather or -mother.  Stephen Hawking, for example, postulates a "chronology protection conjecture."  (For more, see the &lt;a href="http://www.sciam.com/article.cfm?articleID=0004226A-F77D-1D4A-90FB809EC5880000"&gt;article&lt;/a&gt; by Paul Davies in &lt;span style="font-style:italic;"&gt;Scientific American&lt;/span&gt; or his subsequent &lt;a href="http://www.amazon.com/Build-Time-Machine-Paul-Davies/dp/0142001864/"&gt;book&lt;/a&gt;.)  So when I think of an 'archaeological eye' I usually imagine something more voyeuristic: the ability to see or hear or in some way measure the events of the past without affecting the outcome.&lt;br /&gt;&lt;br /&gt;Years later, let's say around Y2K, I was studying history.  Reading Carlo Ginzburg's essay "&lt;a href="http://www.amazon.com/Clues-Myths-Historical-Method-Ginzburg/dp/080184388X/"&gt;Clues&lt;/a&gt;" reminded me of the Eiseley quote once again.  Wouldn't it be cool to write a history based on virtuoso readings of material evidence?  (Like Ginzburg, I read a lot of Sherlock Holmes as a kid.)  Unfortunately, the only thing that I was arguably a virtuoso at reading was books, and even that was a stretch.  Fortunately I was also reading the work of New Institutional Economists at the time.  My head was full of ideas of information costs and transaction costs.  Since it costs something to learn something, we can never know very much.  I had about the same chance of learning to read old shoes or nineteenth-century beer bottles as I did of learning to read sheet music: fairly low.  Choosing to specialize in reading one kind of material evidence would preclude learning to read an almost infinite number of other kinds of traces.&lt;br /&gt;&lt;br /&gt;What to do?  The key word is 'specialize'.  
As with other kinds of work, there is a division of interpretive labor.  In order to make use of material trace evidence, you don't necessarily need to be able to read it yourself, you simply need to be able to find someone who can.  With the traditional tools of scholarship it would have been very difficult to assemble a synoptic view of other people's reconstructions of the past from physical evidence.  The emergence of search engines like Google drastically lowered those information costs, however.  If you type &lt;span style="font-weight:bold;"&gt;interpret "wear marks"&lt;/span&gt; into Google, you will find a reference to a 1958 paper in the &lt;span style="font-style:italic;"&gt;British Chiropody Journal&lt;/span&gt; on using shoe wear marks to diagnose foot troubles.  You'll find a white paper on how to use scattered light to assess surface and bulk defects in various materials, a paper on the use-wear of stone tools, and so on.  You'll find, in other words, a world of chiropodists, materials scientists, forensic scientists, engineers, archaeologists and thousands of other kinds of specialists busy reconstructing the past from its material traces.  These are people in search of a usable past.  They care about past events because they have consequences in the present, and the only way they can access that past is by looking for its &lt;a href="http://www.arthist.lu.se/kultsem/encyclo/index.html"&gt;indexical signs&lt;/a&gt;.  These experts don't always agree with one another; the mutability of history also depends on the fact that learning is costly.  But since our environment is composed entirely of survivals from the past, it is a kind of time machine, constantly transporting everything from some past into the present.  It is one kind of time machine that is worth having... even if it does seem to work in one direction only and is remarkably difficult to use.  
(For more on the idea of the environment as an archive of material traces see my new book &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Archive-Place-Unearthing-Chilcotin-Plateau/dp/0774813768/"&gt;The Archive of Place&lt;/a&gt;&lt;/span&gt;.)&lt;br /&gt;&lt;br /&gt;Next time: the archive as time machine.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/cliopatria" rel="tag"&gt;Cliopatria&lt;/a&gt; | &lt;a href="http://technorati.com/tag/gedankenexperiment" rel="tag"&gt;gedankenexperiment&lt;/a&gt; | &lt;a href="http://technorati.com/tag/time+machines" rel="tag"&gt;time machines&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7437251229316478913?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7437251229316478913'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7437251229316478913'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/08/some-varieties-of-time-machine-worth.html' title='Some Varieties of Time Machine Worth Having'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4568781720758225668</id><published>2007-08-18T08:43:00.000-04:00</published><updated>2007-08-18T09:56:33.478-04:00</updated><title type='text'>Perpetual Analytics with Compression</title><content type='html'>Perpetual analytics is the process of comparing each new item of incoming information to the whole collection at the moment that it is received.  IBM scientist Jeff Jonas &lt;a href="http://jeffjonas.typepad.com/jeff_jonas/2006/02/what_do_you_kno.html"&gt;writes&lt;/a&gt;, "there is an ocean of historical data and it is raining, which is to say new data keeps being introduced ... Think of [perpetual analytics] like 'directing the rain drops' as they fall into the ocean – placing each drop in the right place and measuring the ripples (i.e., finding relationships and relevance to the historical knowledge).  Discovery is made during ingestion and relevant insight is published at that magical moment."  Jonas contrasts this approach with the more traditional process of creating isolated, specialized databases to hold different kinds of information.  Over time, these databases tend to become 'silos': many interesting things might be discovered if the information within them could be integrated, but the &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/04/information-costs.html"&gt;information costs&lt;/a&gt; are too high to do so.&lt;br /&gt;&lt;br /&gt;The most powerful implementation of this idea (not to mention the most difficult) would be general-purpose mining at the scale of the internet.  I'll leave that for Google or IBM.  
Instead, I'm going to describe a special-purpose system that operates in a very restricted and small domain.&lt;br /&gt;&lt;br /&gt;Imagine browsing through a collection of online primary sources that may be relevant for your research.  They could be diary entries, historic newspaper articles or parliamentary records.  As you navigate to each new page, a set of links appears in the right sidebar, the way that sponsored advertisements appear in Google search results.  Instead of being ads, however, these are links to related primary and secondary sources.  If you are reading a letter, for example, there may be links in the sidebar to biographies of the author, recipient or people mentioned in the text.  There may be links to other letters written by these people, or to other letters written at the same time and place.  If some known event is being described, there may be links to historical accounts of that event.  And so on.  If you click on one of these sidebar links, a new tab opens in your browser with that source displayed in it, and with links to other sources that are related to it.  The sidebar provides ambient information that may be useful without distracting you from the task at hand.&lt;br /&gt;&lt;br /&gt;This recommendation system has two very useful features: it is generated automatically and it gets smarter as you use it.  Here's what is going on behind the scenes.  When you browse to a page, the system stores a copy of the text in a database.  If it is the first page you've ever looked at, nothing else happens.  When you go to the second page, however, it stores a copy of the text, then uses the normalized compression distance (NCD) to determine how similar the two pages are. (For more on the NCD, see my &lt;a href="http://digitalhistoryhacks.blogspot.com/search?q=ncd"&gt;earlier posts&lt;/a&gt;.)  As you browse to each new page, a copy is added to the database, and the NCD is calculated for that page and every other one that you've already visited. 
 The sidebar displays links to the closest ones already in the database.&lt;br /&gt;&lt;br /&gt;As described so far, this system is able to cluster your own reading, always showing you links to the most relevant stuff that you've already seen.  To make it really useful, you can seed the database with source collections that are likely to be relevant but are too large to be read systematically.  For example, if you are working in a particular national and temporal context, you might add all of the entries from a dictionary of historical biography.  If you are working in a particular place, you might add complete runs of local newspapers.  For specific fields you could add runs of scholarly journals.  For groups of people you could add correspondence and diaries.&lt;br /&gt;&lt;br /&gt;Furthermore, the system scales up powerfully for collaborative research if the database is shared by everyone working on a particular subject.  As each person finds something of interest, it immediately becomes available for recommendation to any of the others, depending on what they are looking at.  
Built on top of a server-backed version of &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt;, this tool provides one path to leveraging the power of &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/12/pedagogy-for-collective-intelligence.html"&gt;collective intelligences&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/browser" rel="tag"&gt;browser&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+compression" rel="tag"&gt;data compression&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/kolmogorov+complexity" rel="tag"&gt;Kolmogorov complexity&lt;/a&gt; | &lt;a href="http://technorati.com/tag/perpetual+analytics" rel="tag"&gt;perpetual analytics&lt;/a&gt; | &lt;a href="http://technorati.com/tag/zotero" rel="tag"&gt;Zotero&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4568781720758225668?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4568781720758225668'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4568781720758225668'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/08/perpetual-analytics-with-compression.html' title='Perpetual Analytics with Compression'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-926943042810446471</id><published>2007-07-31T09:39:00.000-04:00</published><updated>2007-07-31T11:40:08.245-04:00</updated><title type='text'>Putting It in Your Own Words</title><content type='html'>When we teach history students how to take notes for research, we usually tell them to take down direct quotes sparingly, and to put things in their own words instead.  Many university writing labs provide training in the art of &lt;a href="http://owl.english.purdue.edu/handouts/research/r_paraphr.html"&gt;paraphrasing&lt;/a&gt;.  One concern is that direct quotes lend themselves to witting or unwitting plagiarism, especially if the paper is being written the night before it's due.&lt;br /&gt;&lt;br /&gt;I've always found paraphrasing to be an unsatisfactory exercise because it is in direct tension with close reading.  You read the original passage carefully, set it to one side, and then write out the ideas in your own words.  At that point you're supposed to re-read the original passage and make sure that you captured the essence.  Of course you didn't.  As Mark Twain once said, "The difference between the &lt;span style="font-style:italic;"&gt;almost&lt;/span&gt;-right word &amp;amp; the &lt;span style="font-style:italic;"&gt;right&lt;/span&gt; word is really a large matter -- it's the difference between the lightning-bug and the lightning." [&lt;a href="http://www.amazon.com/Yale-Book-Quotations-Joseph-Epstein/dp/0300107986/"&gt;*&lt;/a&gt;] If a student came to me with this example, I'd tell them that there are times when you really should quote rather than paraphrase.  
&lt;br /&gt;&lt;br /&gt;In fact, when I'm taking notes, I usually write down a lot of direct quotes.  When I go back to them later, I find that the author's exact words serve as much better reminders of his or her work than paraphrases do.  And when I write my first draft of anything, I usually have a lot more quotes than I'm going to want to have in the final version.  I know that I'm going to re-read and re-write each passage dozens of times, and that all but the best quotes will be squeezed out in the process.&lt;br /&gt;&lt;br /&gt;The problem of putting something in your own words is paralleled in machine learning by a problem known as overfitting.  Suppose you work on the production line of a company that makes delicious little chocolates with multi-colored candy shells [&lt;a href="http://www.nestle.ca/en/products/brands/smarties/index"&gt;Cdn&lt;/a&gt;|&lt;a href="http://us.mms.com/us/"&gt;US&lt;/a&gt;].  Even though all of the candies taste the same, your company has come to the conclusion that people pay attention to the color ... they have marketing campaigns based on a preference for eating the red ones last, or the ability to customize the color, or whatever.  Your job is to look at the candies as they go by and sort them by color, tossing out any that don't match one of the approved shades.  (Sometimes the coloring machine malfunctions and you end up with colors that are more appropriate to your &lt;a href="http://jellybelly.com/Cultures/en-US/Shop/CandyDetails.htm?CS_ProductID=1064337&amp;CS_Category=BertieBotts&amp;CS_Catalog=B2C"&gt;competitor&lt;/a&gt;.) Now any hacker in this situation is going to build a robot, so you do.  As the candies come down the line, the robot tries to sort them and you provide feedback.  If you don't provide enough training, the robot might decide that all of the candies are either blue or red.  It is right some of the time, but not enough.  
That is known as &lt;span style="font-style:italic;"&gt;underfitting&lt;/span&gt;.  If you provide it with too much training on a limited set of examples, it might be correct 100 percent of the time for those examples, but at the cost of memorizing too much detail.  Suppose you see five candies in a row, and categorize each as blue.  To simplify quite a bit, things that we call "blue" have a wavelength around 475 nanometers.  Your robot, however, comes up with five very specific rules: IF WAVELENGTH = 460.83429nm THEN COLOR = blue; IF WAVELENGTH = 483.00089nm THEN COLOR = blue; and so on.  Once you turn it loose on a new batch of candies, it is going to start malfunctioning, because it learned too much detail about your original set of examples.  It doesn't know what to do if the wavelength is 460.84000nm. This is the problem of &lt;span style="font-style:italic;"&gt;overfitting&lt;/span&gt;.  Now there are a lot of sophisticated &lt;a href="http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-3.html"&gt;methods&lt;/a&gt; for avoiding these problems if you are forced to model a limited data set.  But the best way to avoid them is to use a lot of training data.&lt;br /&gt;&lt;br /&gt;Which brings us back to putting things in your own words.  The problem that students encounter with note-taking doesn't have as much to do with quoting vs. paraphrasing as you might think.  The problem has to do with not looking at enough sources.  If you only consult a handful of sources, then direct quoting might lead you to plagiarism, which would be a case of overfitting.  If you paraphrase a handful of sources instead, you may avoid plagiarism but your essay isn't going to be any more nuanced.  That is going to lead to underfitting.  Either way, a model of a small number of sources is bound to be a bad predictor for the sources that you didn't consult.  The only way out is to read more... a lot more.  
(See my earlier post on "&lt;a href="http://digitalhistoryhacks.blogspot.com/2006/11/difference-that-makes-difference.html"&gt;The Difference That Makes a Difference&lt;/a&gt;.")&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/convergence" rel="tag"&gt;convergence&lt;/a&gt; | &lt;a href="http://technorati.com/tag/machine+learning" rel="tag"&gt;machine learning&lt;/a&gt; | &lt;a href="http://technorati.com/tag/overfitting" rel="tag"&gt;overfitting&lt;/a&gt; | &lt;a href="http://technorati.com/tag/reading" rel="tag"&gt;reading&lt;/a&gt; | &lt;a href="http://technorati.com/tag/writing" rel="tag"&gt;writing&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-926943042810446471?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/926943042810446471'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/926943042810446471'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/07/putting-it-in-your-own-words.html' title='Putting It in Your Own Words'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-9087016170449122332</id><published>2007-07-21T10:02:00.000-04:00</published><updated>2007-07-21T17:46:26.484-04:00</updated><title type='text'>Import-Export Specialists</title><content type='html'>James Clifford once said in an interview that he "often function[s] as a kind of import-export specialist between the disciplines" [&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Edges-Anthropology-Interviews-Prickly-Paradigm/dp/0972819606/"&gt;On the Edges of Anthropology&lt;/a&gt;&lt;/span&gt;, 55].  I think it's a great description of a particular kind of academic work: finding an idea, tool or technique that is well understood in one context and putting it to use in another.  It has particular relevance for the practice of public history.&lt;br /&gt;&lt;br /&gt;While thinking about ways of enriching historical practice with digital sources and computation, I've had a lot of occasion to draw on programming, machine learning, and statistical linguistics. In part, these choices reflect my own interests and training before I became a historian.  More than that, they're pretty obvious places to look for inspiration.  In many ways, digital history is still very textual.  It highlights the act of reading, most tools are designed to augment reading or serve as surrogates for it, and outputs are almost always textual in turn.  This is as it should be.  Most historians (myself included) love to read.  
Academic history will remain a primarily textual discipline for the foreseeable future.&lt;br /&gt;&lt;br /&gt;As I've begun to explore the idea of creating devices and environments that convey a more ambient sense of the past, however, I've had to look a bit further afield for my imports, finding many opportunities to learn from people involved in interaction design, robotics, performance and electronic music.  These scholars are often disciplinary import-export specialists in their own right.  If you have some time this summer to spend hacking history appliances, here are some good starting points.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Interaction design&lt;/span&gt;.  Try Bill Moggridge's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.designinginteractions.com/"&gt;Designing Interactions&lt;/a&gt;&lt;/span&gt; and Dan Saffer's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.designingforinteraction.com/"&gt;Designing for Interaction&lt;/a&gt;&lt;/span&gt;.    &lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Robotics&lt;/span&gt;.  The behavior-based approach of &lt;a href="http://www.amazon.com/Cambrian-Intelligence-Early-History-New/dp/0262522632/"&gt;Rodney Brooks&lt;/a&gt; and his colleagues starts with simple but fully functional creatures interacting with the real world.  More complicated systems are built by adding layers of control which subsume lower-level functionality.  This strategy lends itself to designing robust interactions between people and history appliances, as I will show in detail in a future post.  
The related &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/JunkBots-Bugbots-Bots-Wheels-Technology/dp/0072226013/"&gt;Junkbots, Bugbots and Bots on Wheels&lt;/a&gt;&lt;/span&gt; is a good source of ideas and techniques.&lt;br /&gt;&lt;br /&gt;I also really enjoy reading the blog of &lt;a href="http://ashishrd.blogspot.com/"&gt;Ashish Derhgawen&lt;/a&gt;, who comes up with some very creative hacks on a fairly limited budget.  This summer he's already figured out a way to use his cellphone as a remote door opener, written a program that can play the classic video game Pong by watching the screen with a webcam, and given one of his robots the ability to respond to claps and whistles.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Performance&lt;/span&gt;. The best book that I've found so far for hooking up sensors and actuators to your computer is Tom Igoe and Dan O'Sullivan's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Physical-Computing-Sensing-Controlling-Computers/dp/159200346X/"&gt;Physical Computing&lt;/a&gt;&lt;/span&gt;.  Both of the authors are associated with NYU's Interactive Telecommunications Program, and their focus on live events makes their work particularly useful for people who want to design experiences.  The fact that they usually teach artists rather than engineers makes for a very readable work.  Igoe's physical computing &lt;a href="http://tigoe.com/pcomp/index.shtml"&gt;website&lt;/a&gt; is also a great resource.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight:bold;"&gt;Electronica&lt;/span&gt;.  I like listening to electronic music, but hadn't learned anything about it until quite recently.  What I've read about its history suggests that it is quite common for electronic musicians to spend a fair amount of their time building new instruments and exploring their creative possibilities.  
The &lt;a href="http://www.cycling74.com/"&gt;Cycling '74&lt;/a&gt; website has an interesting collection of resources, including videos, interviews and tutorials.  The &lt;a href="http://www.createdigitalmusic.com/"&gt;Create Digital Music&lt;/a&gt; webzine is also full of useful stuff.  For me, electronica is Ultima Thule: so far out there that I have a hard time finding my most trusted landmarks (i.e., good books on the subject).  Pinch and Trocco's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Analog-Days-Invention-Impact-Synthesizer/dp/0674016173/"&gt;Analog Days&lt;/a&gt;&lt;/span&gt; is an exception. &lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/bricolage" rel="tag"&gt;bricolage&lt;/a&gt; | &lt;a href="http://technorati.com/tag/electronica" rel="tag"&gt;electronica&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interdisciplinarity" rel="tag"&gt;interdisciplinarity&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-9087016170449122332?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9087016170449122332'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9087016170449122332'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/07/import-export-specialists.html' title='Import-Export Specialists'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4745934106948787761</id><published>2007-07-19T16:54:00.001-04:00</published><updated>2008-12-29T18:08:05.432-05:00</updated><title type='text'>History Appliances: Spöka</title><content type='html'>On a recent trip to Ikea I came across this &lt;a href="http://www.ikea.com/ca/en/catalog/products/90064441"&gt;awesome little dude&lt;/a&gt;.  They're selling Spöka as "children's lighting," but it was pretty clear to me that it was one hack short of a history appliance.  It has a rechargeable battery, so that you can use it without it being plugged in.  If you slide off the rubber skin, there is a light-bulb-shaped plastic housing inside.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlXqeZSebI/AAAAAAAAAFw/_nJViXMlt4c/s1600-h/spoka-001-1024.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlXqeZSebI/AAAAAAAAAFw/_nJViXMlt4c/s200/spoka-001-1024.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285352024921242034" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The designer thoughtfully created a case which can be opened into three parts and reassembled with nothing more than a small screwdriver.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlXzsHcPRI/AAAAAAAAAF4/8U9MwD8wU_c/s1600-h/spoka-002-1024.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" 
src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlXzsHcPRI/AAAAAAAAAF4/8U9MwD8wU_c/s200/spoka-002-1024.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285352183223303442" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;On the top you'll find a simple push button toggle to turn it on and off.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlX6RERX0I/AAAAAAAAAGA/fjmY6u29pXM/s1600-h/spoka-003-1024.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlX6RERX0I/AAAAAAAAAGA/fjmY6u29pXM/s200/spoka-003-1024.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285352296221335362" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;We want to be able to control the light with the computer, however, so I interrupted the power supply by cutting the circuit to the battery and soldering in a pair of wires (the blue ones).  I put a bit of heat-shrink tubing over the joints to make them more resilient.  
I also knotted the wires to provide strain relief where they will emerge from the case.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlYCKMa7nI/AAAAAAAAAGI/BA25J7m0frQ/s1600-h/spoka-005-1024.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlYCKMa7nI/AAAAAAAAAGI/BA25J7m0frQ/s200/spoka-005-1024.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285352431815421554" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;When the case is reassembled, the wires can be fed out of the top of the hole where the recharging plug goes in.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlYIZB2fCI/AAAAAAAAAGQ/QkoNexUm5xk/s1600-h/spoka-007-1024.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlYIZB2fCI/AAAAAAAAAGQ/QkoNexUm5xk/s200/spoka-007-1024.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285352538876836898" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;After you slide the rubber skin back on, you have an LED-lamp that can be controlled by your computer.  If you want to wire it up directly, you might use your parallel port, like Eric Wilhelm does for the &lt;a href="http://www.makezine.com/03/halloween/"&gt;haunted house controller&lt;/a&gt; in &lt;span style="font-style:italic;"&gt;Make&lt;/span&gt; volume 3.  Instead, I incorporated it into my standard history appliance rig, which uses &lt;a href="http://phidgets.com/"&gt;Phidgets&lt;/a&gt; controlled by &lt;a href="http://www.cycling74.com/"&gt;Max/MSP&lt;/a&gt;.  &lt;br /&gt;&lt;br /&gt;For a quick demo project, I created a browser that lets me look through historic newspaper articles about s&amp;eacute;ances from the online &lt;span style="font-style:italic;"&gt;Globe and Mail&lt;/span&gt; archive.  
While browsing the stories from a particular time period, Sp&amp;ouml;ka flashes gently in the background, faster if there are a lot of them, slower if not.  It provides a nice peripheral feel for the intensity of Spiritualist activity at that point in time.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlYRs3zSFI/AAAAAAAAAGY/vMHx-RjyRVA/s1600-h/spoka-009-1024.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlYRs3zSFI/AAAAAAAAAGY/vMHx-RjyRVA/s200/spoka-009-1024.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285352698822215762" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/max+msp+jitter" rel="tag"&gt;Max/MSP/Jitter&lt;/a&gt; | &lt;a href="http://technorati.com/tag/phidgets" rel="tag"&gt;phidgets&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4745934106948787761?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4745934106948787761'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4745934106948787761'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/07/history-appliances-spka.html' title='History Appliances: Spöka'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlXqeZSebI/AAAAAAAAAFw/_nJViXMlt4c/s72-c/spoka-001-1024.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-9050847547475777273</id><published>2007-07-02T17:05:00.001-04:00</published><updated>2008-12-29T18:03:31.991-05:00</updated><title type='text'>Search Refinement with Compression</title><content type='html'>A few days ago I described a way of using Cilibrasi and Vit&amp;aacute;nyi's Normalized Compression Distance (NCD) to automatically &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/06/clustering-with-compression.html"&gt;cluster biographical entries&lt;/a&gt; from the online &lt;span style="font-style:italic;"&gt;Dictionary of Canadian Biography&lt;/span&gt;.  A compression algorithm keeps track of redundancies when it is compressing a string.  If those redundancies also occur in another string, then the two strings have something in common (i.e., the redundancies).  The NCD ranges from 0 (if the two strings are identical) to 1 (if there is absolutely no overlap).  Details are in the original article and laid out in one of my &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/03/compression.html"&gt;earlier posts&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Compression can also be used to automatically refine searches.  Suppose you are interested in the explorer Martin Frobisher.  If you type "Frobisher" into Yahoo!, some of the first few pages of hits are relevant and some are not.  
Usually you have to wade through the results (or specify more search keywords and hope you don't eliminate something interesting by being too specific.)&lt;br /&gt;&lt;br /&gt;An alternate strategy is to enter a broad search keyword (e.g., "Frobisher") and use the NCD to automatically compare the summary that Yahoo! returns for each hit with a "probe" text such as Frobisher's &lt;span style="font-style:italic;"&gt;DCB&lt;/span&gt; entry.  A short Python program to do exactly that is listed &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ncd-probe.py.html"&gt;here&lt;/a&gt;.  The search engine results can then be ranked according to increasing NCD from the probe text.&lt;br /&gt;&lt;br /&gt;The figure below shows the first 31 of 50 hits for "Frobisher" before and after this search refinement process.  I used red font to indicate the irrelevant results. As can be seen, this use of compression and a probe text does a good job of floating the relevant hits to the top of the pile.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlXNobsDFI/AAAAAAAAAFo/jKXQXM3t2AY/s1600-h/frobisher-ncd-graph.png"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 154px; height: 200px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlXNobsDFI/AAAAAAAAAFo/jKXQXM3t2AY/s200/frobisher-ncd-graph.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5285351529399454802" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/data+compression" rel="tag"&gt;data compression&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/dictionary+of+canadian+biography" rel="tag"&gt;Dictionary of Canadian Biography&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/kolmogorov+complexity" rel="tag"&gt;Kolmogorov 
complexity&lt;/a&gt; | &lt;a href="http://technorati.com/tag/search" rel="tag"&gt;search&lt;/a&gt; | &lt;a href="http://technorati.com/tag/visualization" rel="tag"&gt;visualization&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-9050847547475777273?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9050847547475777273'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9050847547475777273'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/07/search-refinement-with-compression.html' title='Search Refinement with Compression'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlXNobsDFI/AAAAAAAAAFo/jKXQXM3t2AY/s72-c/frobisher-ncd-graph.png' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-4821836714501621616</id><published>2007-06-27T15:36:00.001-04:00</published><updated>2008-12-29T18:01:08.101-05:00</updated><title type='text'>Clustering with Compression</title><content type='html'>Last spring I posted a short piece about Rudi Cilibrasi and Paul Vit&amp;aacute;nyi's use of &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/03/compression.html"&gt;compression&lt;/a&gt; as a universal method for clustering ["Clustering by Compression," &lt;span style="font-style:italic;"&gt;IEEE Transactions on Information Theory&lt;/span&gt; 51, no. 
4 (2005): 1523-45, &lt;a href="http://homepages.cwi.nl/~paulv/papers/cluster.pdf"&gt;PDF&lt;/a&gt;].  The basic idea is that a compression algorithm makes a string shorter by keeping track of redundancies and eliminating them.  Suppose you have two strings, &lt;span style="font-style:italic;"&gt;x&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;y&lt;/span&gt;.  If there is some overlap between them, then compressing the concatenated string &lt;span style="font-style:italic;"&gt;xy&lt;/span&gt; should give a shorter result than compressing &lt;span style="font-style:italic;"&gt;x&lt;/span&gt; and &lt;span style="font-style:italic;"&gt;y&lt;/span&gt; separately and concatenating the results. (There are more details in my earlier post.)  Cilibrasi and Vit&amp;aacute;nyi formalized this idea as the Normalized Compression Distance (NCD).&lt;br /&gt;&lt;br /&gt;In my earlier post I selected a handful of entries from the &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.biographi.ca/EN/index.html"&gt;Dictionary of Canadian Biography&lt;/a&gt;&lt;/span&gt; and submitted them to an open source clustering program that Cilibrasi and Vit&amp;aacute;nyi had provided.  I chose the people that I did because I already had some idea of how I would group them myself. The results of the automated clustering were very encouraging, but I hadn't had a chance to follow up with compression-based clustering until now.&lt;br /&gt;&lt;br /&gt;This time around, I decided to do a more extensive test.  I wrote a Python &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ncd-dcb.py.html"&gt;program&lt;/a&gt; to randomly select 100 biographies from volume 1 of the &lt;span style="font-style:italic;"&gt;DCB&lt;/span&gt; and compute the NCD between each pair.  
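&lt;br /&gt;&lt;br /&gt;For readers who want to try the pairwise step themselves, here is a minimal sketch.  It is not the program linked above.  It assumes zlib as the compressor, uses the definition from Cilibrasi and Vit&amp;aacute;nyi's paper, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed length, and the function names and sample texts are made up for illustration.

```python
import zlib
from itertools import combinations


def compressed_size(data):
    """Length in bytes of data after zlib compression at the highest level."""
    return len(zlib.compress(data, 9))


def ncd(x, y):
    """Normalized Compression Distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def pairwise_ncd(texts):
    """Return (ncd, name_a, name_b) for every unordered pair, closest first.

    texts maps a (hypothetical) document name to its content as bytes.
    """
    results = []
    for (name_a, a), (name_b, b) in combinations(sorted(texts.items()), 2):
        results.append((ncd(a, b), name_a, name_b))
    return sorted(results)


if __name__ == "__main__":
    # Made-up stand-ins for biographies; real entries are much longer.
    sample = {
        "trader_1": b"fur trader at Hudson Bay for the company " * 20,
        "trader_2": b"fur trader at Hudson Bay, later governor " * 20,
        "settler": b"planter and settler on the Newfoundland coast " * 20,
    }
    for distance, name_a, name_b in pairwise_ncd(sample):
        print(f"{distance:.3f}  {name_a} / {name_b}")
```

Given 100 texts, pairwise_ncd returns 100 x 99 / 2 = 4,950 distances.  With a real compressor the values are only approximately bounded by 0 and 1: identical strings typically score a little above 0, and unrelated ones can score slightly above 1.&lt;br /&gt;&lt;br /&gt;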
A second Python &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ncd-dcb-graph.py.html"&gt;program&lt;/a&gt; combed through the output file of the first to automatically create a Graphviz &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/ncd-dcb-graph.txt"&gt;script&lt;/a&gt; that plots all connections below a user-specified threshold.  The resulting graph is shown below for NCDs &amp;lt; 0.77. &lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlV1540RGI/AAAAAAAAAEw/20KkrTkezFI/s1600-h/ncd-graph-00.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 146px; height: 200px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlV1540RGI/AAAAAAAAAEw/20KkrTkezFI/s200/ncd-graph-00.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350022256542818" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Looking at different parts of the figure in turn, it becomes clear that this is a remarkably powerful technique, especially considering that the code is very simple and the algorithm has no domain-specific knowledge whatsoever.  
It knows &lt;span style="font-style:italic;"&gt;nothing&lt;/span&gt; about history or about the English language, and yet it is able to find connections among biographical entries that are meaningful to a human interpreter.&lt;br /&gt;&lt;br /&gt;One isolated chain of biographies consists of people active in Hudson Bay in the late seventeenth century.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlV914abCI/AAAAAAAAAE4/6qzHqv_Tkbs/s1600-h/ncd-graph-01.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlV914abCI/AAAAAAAAAE4/6qzHqv_Tkbs/s200/ncd-graph-01.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350158620060706" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;A second isolated cluster consists mostly of Englishmen who settled on the coasts of Newfoundland in the early 1600s.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlWFZMsScI/AAAAAAAAAFA/oJuPJLpFpRU/s1600-h/ncd-graph-02.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlWFZMsScI/AAAAAAAAAFA/oJuPJLpFpRU/s200/ncd-graph-02.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350288359442882" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The main cluster is composed almost entirely of people based in various parts of Quebec in the seventeenth century.  
There is an arm of Acadian settlers, some of whom spent time in Quebec.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlWN6UJ7pI/AAAAAAAAAFI/1mTElRvThrU/s1600-h/ncd-graph-03.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlWN6UJ7pI/AAAAAAAAAFI/1mTElRvThrU/s200/ncd-graph-03.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350434688069266" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;There is a somewhat puzzling arm that I haven't really figured out.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlWWEUFLmI/AAAAAAAAAFQ/Ybk_Q1jIw74/s1600-h/ncd-graph-04.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://1.bp.blogspot.com/_mg_RqiBYrpE/SVlWWEUFLmI/AAAAAAAAAFQ/Ybk_Q1jIw74/s200/ncd-graph-04.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350574811065954" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;And there is the main body of the network and a downward projection that are comprised entirely of seventeenth-century Qu&amp;eacute;b&amp;eacute;cois.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlWgVS9dgI/AAAAAAAAAFY/vm-nUMn48zQ/s1600-h/ncd-graph-06.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlWgVS9dgI/AAAAAAAAAFY/vm-nUMn48zQ/s200/ncd-graph-06.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350751168460290" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlWpLz0JaI/AAAAAAAAAFg/Lx1gMF5oz9I/s1600-h/ncd-graph-05.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 150px;" 
src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlWpLz0JaI/AAAAAAAAAFg/Lx1gMF5oz9I/s200/ncd-graph-05.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285350903240730018" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;One benefit of this method is the incredible speed with which it executes.  It took only seconds to calculate 4,950 NCDs; clustering the entire &lt;span style="font-style:italic;"&gt;DCB&lt;/span&gt; would require computing 35,427,153 distance measures and would take less than a day to run on my inexpensive home computer.  I'll save that hack for another time.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/data+compression" rel="tag"&gt;data compression&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/dictionary+of+canadian+biography" rel="tag"&gt;Dictionary of Canadian Biography&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/kolmogorov+complexity" rel="tag"&gt;Kolmogorov complexity&lt;/a&gt; | &lt;a href="http://technorati.com/tag/visualization" rel="tag"&gt;visualization&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-4821836714501621616?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4821836714501621616'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/4821836714501621616'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/06/clustering-with-compression.html' title='Clustering with Compression'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlV1540RGI/AAAAAAAAAEw/20KkrTkezFI/s72-c/ncd-graph-00.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1892107480225023596</id><published>2007-06-19T09:14:00.000-04:00</published><updated>2007-06-19T09:32:56.462-04:00</updated><title type='text'>Hope, the New Research Strategy</title><content type='html'>In February of this year, I realized that I needed to read an article from an obscure but relatively important scientific journal published in 1797.  Following the advice that Thomas Mann gives in his excellent &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Oxford-Guide-Library-Research/dp/0195189981/"&gt;Oxford Guide to Library Research&lt;/a&gt;&lt;/span&gt;, I started by assuming that what I was looking for must exist.  I spent a couple of hours looking for an online copy, using many of the techniques that I've described in this blog, and a few I haven't gotten around to describing.  No luck.&lt;br /&gt;&lt;br /&gt;At that point, of course, I might have found a copy in one of the libraries or archives in the greater Toronto area and gone there to read it.  Instead, I decided to try something completely different.  I made a note of what I wanted and recorded the details of my inconclusive search.  Then I said to myself, "boy I sure hope that Google digitizes the journal soon and makes a full copy available on Google Books."  At that point there was nothing left to do but wait.&lt;br /&gt;&lt;br /&gt;Today, again, I realized that I really should read that article.  
When I typed the journal title into Google Books, there it was, waiting for me to download.  Now, I don't recommend this strategy to anyone working on, say, their dissertation.  Maybe it was a complete fluke.  I suspect, however, that it is a weak measure of the seismic shifts that our landscape of &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/04/information-costs.html"&gt;information costs&lt;/a&gt; is currently undergoing.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/Google" rel="tag"&gt;Google&lt;/a&gt; | &lt;a href="http://technorati.com/tag/information+costs" rel="tag"&gt;information costs&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1892107480225023596?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1892107480225023596'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1892107480225023596'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/06/hope-new-research-strategy.html' title='Hope, the New Research Strategy'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6477555199373152868</id><published>2007-06-18T16:54:00.000-04:00</published><updated>2007-06-18T18:02:21.904-04:00</updated><title type='text'>Seeing There</title><content type='html'>In his 1991 book &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Mirror-Worlds-Software-Universe-Shoebox-How/dp/019507906X/"&gt;Mirror Worlds&lt;/a&gt;&lt;/span&gt;, David Gelernter argued that we should be trying to engineer systems that help us "to achieve something that is so universally important and yet so hard to come by that it doesn't even have a word to describe it."  Gelernter called it topsight. "If &lt;span style="font-style:italic;"&gt;insight&lt;/span&gt; is the illumination to be achieved by penetrating inner depths," he wrote, "&lt;span style="font-style:italic;"&gt;topsight&lt;/span&gt; is what comes from a far-overhead vantagepoint, from a bird's eye view that reveals &lt;span style="font-style:italic;"&gt;the whole&lt;/span&gt;--the big picture; how the parts fit together.  ('Overview' comes fairly close to what I mean.  But an 'overview' is something you either have or you don't.  &lt;span style="font-style:italic;"&gt;Topsight&lt;/span&gt; is something that, like insight, you pursue avidly and continuously, and achieve gradually)" (pp. 52-53). 
One route to topsight that Gelernter proposed was to build microcosmic "mirror worlds" that would faithfully reflect live data collected continuously from the world, and summarized to create levels of abstraction.&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;The Mirror World is directly accessible, twenty-four hours a day, to the population that it tracks. You can parachute in your own software agents. They look out for your interests, or gather data that you need, or let you know when something significant seems to be going on. You consult the Mirror World like an encyclopedia when you need information; you read it like a dashboard when you need a fast take on current status (p.6).&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;When I first read &lt;span style="font-style:italic;"&gt;Mirror Worlds&lt;/span&gt; sixteen years ago, this seemed like science fiction, a beautiful dream.  Now it seems like business as usual.  Using my GPS-enabled &lt;a href="http://www.mio-tech.be/en/gps-navigation-device-Mio-A701-overview.htm"&gt;phone&lt;/a&gt; and &lt;a href="http://www.google.com/gmm/index.html"&gt;Google Maps mobile&lt;/a&gt; I'm able to see an aerial view of my position and route.  I like to walk whenever I can, and often study &lt;a href="http://www.weatheroffice.gc.ca/radar/index_e.html?id=WSO"&gt;radar imagery&lt;/a&gt; to look for gaps in rain or snowfall.  If I'm stuck on the highway, I can  use the phone to look ahead through a series of &lt;a href="http://www.mto.gov.on.ca/english/traveller/compass/camera/camhome.htm"&gt;traffic cameras&lt;/a&gt;.  Google has recently upped the ante by adding an incredible amount of data to Google Earth and by rolling out their new &lt;a href="http://maps.google.com/help/maps/streetview/"&gt;Street View&lt;/a&gt; for select cities.  It's become natural to treat the internet as, in William J. 
Mitchell's colorful phrase, "a worldwide, time-zone-spanning optic nerve with electronic eyeballs at its endpoints" (&lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/City-Bits-Space-Place-Infobahn/dp/0262631768/"&gt;City of Bits&lt;/a&gt;&lt;/span&gt;, 31).&lt;br /&gt;&lt;br /&gt;The widespread digitization of historical sources raises the question of what kinds of top-level views we can have into the past.  Obviously it's possible to visit an archive in real life or in Second Life, and easy to imagine locating the archive in Google Earth.  It is also possible to geocode sources, linking each to the places to which it relates or refers.  Some of this will be done manually and accurately, some automatically with a lower degree of accuracy.  Augmenting places with sources, however, raises new questions about selectivity.  Without some way of filtering or making sense of these place-based records, what we'll end up with at best will be an overview, and not topsight.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/geocoding" rel="tag"&gt;geocoding&lt;/a&gt; | &lt;a href="http://technorati.com/tag/microcosm" rel="tag"&gt;microcosm&lt;/a&gt; | &lt;a href="http://technorati.com/tag/place" rel="tag"&gt;place&lt;/a&gt; | &lt;a href="http://technorati.com/tag/topsight" rel="tag"&gt;topsight&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-6477555199373152868?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6477555199373152868'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/6477555199373152868'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/06/seeing-there.html' title='Seeing There'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7344184598744242158</id><published>2007-06-09T11:23:00.000-04:00</published><updated>2007-06-09T12:15:22.553-04:00</updated><title type='text'>History Appliances: The Soundscape</title><content type='html'>I recently had a chance to visit my friends at the &lt;a href="http://chnm.gmu.edu/"&gt;Center for History and New Media&lt;/a&gt; and give a brown bag talk about &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/03/coming-soon-history-appliances.html"&gt;history appliances&lt;/a&gt;.  I outlined my optimistic version of the idea and Rob MacDougall's more pessimistic (and probably more realistic) &lt;a href="http://www.robmacdougall.org/index.php/2007/04/history-and-appliances-1/"&gt;version&lt;/a&gt;.  For people who want to get started building their own history appliances, I discussed some of the &lt;a href="http://www.amazon.com/Physical-Computing-Sensing-Controlling-Computers/dp/159200346X/"&gt;wetware&lt;/a&gt;, &lt;a href="http://www.cycling74.com/"&gt;software&lt;/a&gt; and &lt;a href="http://www.phidgets.com/"&gt;hardware&lt;/a&gt; that might be useful.  While I was there, I also chatted with &lt;a href="http://www.dancohen.org/"&gt;Dan&lt;/a&gt;, &lt;a href="http://edwired.org/"&gt;Mills&lt;/a&gt; and &lt;a href="http://www.foundhistory.org/"&gt;Tom&lt;/a&gt; for an episode in their excellent &lt;a href="http://digitalcampus.tv/2007/05/30/episode-07-history-appliances/"&gt;Digital Campus podcast&lt;/a&gt; series.  
Unfortunately I had a bad cold, so even their audio tech expert couldn't make me sound more like Barry White than Barry Gibb.&lt;br /&gt;&lt;br /&gt;The best part of the brown bag talk was that I was able to break in the middle for a brainstorming session with the CHNM audience, where they came up with a bunch of great new ideas.  &lt;a href="http://historytalk.typepad.com/"&gt;Paula&lt;/a&gt;, for example, suggested that it might be possible to create a tunable soundscape.  Set the dial for 1873 and you might hear the sounds of horse hooves and wagon wheels on cobblestones, church bells or the cries of street vendors.  Such an appliance would be truly ambient, and could be based on the kinds of work being done by historians like &lt;a href="http://www.amazon.com/Soundscape-Modernity-Architectural-Acoustics-Listening/dp/0262701065/"&gt;Emily Thompson&lt;/a&gt; or &lt;a href="http://www.amazon.com/Early-America-Sounded-Richard-Cullen/dp/0801441269/"&gt;Richard Cullen Rath&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Our discussion of the tunable soundscape quickly veered into questions of uncertainty (would we really know what the component sounds were?), veracity (how would we know if we got it right?) and lived experience (what did it sound like to the people of the time?)  In the limit we would be faced with the problem raised by the philosopher Thomas Nagel in his famous &lt;a href="http://members.aol.com/NeoNoetics/Nagel_Bat.html"&gt;essay&lt;/a&gt; "What is it like to be a bat?"  Sure, we might make a machine that could convey &lt;span style="font-style:italic;"&gt;exactly&lt;/span&gt; what it is like for one of us to be a bat, but it still wouldn't tell us what it is like for a bat to be a bat.&lt;br /&gt;&lt;br /&gt;For me that raises one of the key benefits of the history appliance idea.  We should approach the building of history appliances as a form of critical, reflective practice ... 
we make things, we design interactions, to give us new routes into the questions that historians have always struggled with.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/lived+experience" rel="tag"&gt;lived experience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/max+msp+jitter" rel="tag"&gt;Max/MSP/Jitter&lt;/a&gt; | &lt;a href="http://technorati.com/tag/phidgets" rel="tag"&gt;phidgets&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7344184598744242158?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7344184598744242158'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7344184598744242158'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/06/history-appliances-soundscape.html' title='History Appliances: The Soundscape'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8776732206510380540</id><published>2007-05-22T09:21:00.000-04:00</published><updated>2007-05-22T16:29:56.822-04:00</updated><title type='text'>What It's About 3: Interaction</title><content type='html'>Although digital history isn't about computers per se, we still have to take into account our mediated interactions with other people, and with various constellations of hardware and software.  Over the past few decades we've seen the widespread proliferation of human-scale interactive devices.  These have been driven, in part, by advances in electronics.  Thanks to transistors and integrated circuits, small electric motors, tiny radio transceivers, LEDs, lasers and relatively long-lasting batteries, more and more people are toting around cell phones, pagers, laptops, digital cameras, music players, and GPS receivers.  As any reader of &lt;a href="http://www.gizmodo.com/"&gt;Gizmodo&lt;/a&gt; or &lt;a href="http://www.engadget.com/"&gt;Engadget&lt;/a&gt; knows, these devices are legion.  Thanks to various standards like &lt;a href="http://computer.howstuffworks.com/wireless-network.htm"&gt;Wifi&lt;/a&gt; and &lt;a href="http://electronics.howstuffworks.com/bluetooth.htm"&gt;Bluetooth&lt;/a&gt;, they are usually networked to one another and to the Internet.&lt;br /&gt;&lt;br /&gt;In the same period there has been a rethinking and expansion of a field which used to be known as human-computer interaction, and is now known as interaction design.  
Interaction designers are responsible for making it easier, more obvious or more intuitive to place a call, post to a blog, get money from an ATM, buy tunes, order an espresso, shift into overdrive or pay taxes.  For many years people approached computers at a level very close to the machine, flipping switches or punching out ones and zeros on cards.  As networked computation becomes ever more pervasive in our environments it takes many more forms.  It may be invisible, like the network of microprocessors that &lt;a href="http://electronics.howstuffworks.com/car-computer.htm"&gt;keep your car running efficiently&lt;/a&gt; or &lt;a href="http://www.detnews.com/2005/autosinsider/0510/03/A01-335316.htm"&gt;provide telltale data&lt;/a&gt; after an 'event.'  It may seem like something else: a phone conversation, a game, recorded music, even a &lt;a href="http://www.americanheart.org/presenter.jhtml?identifier=4676"&gt;heartbeat&lt;/a&gt;.  New devices allow people to use &lt;a href="http://us.wii.com/experience_gallery.jsp"&gt;motion and gesture&lt;/a&gt; as inputs.  Many interaction designers follow the advice of IDEO's &lt;a href="http://www.designinginteractions.com/"&gt;Bill Moggridge&lt;/a&gt; to think in terms of "verbs, not nouns."&lt;br /&gt;&lt;br /&gt;As an example of the potential of thinking in verbs, consider reading.  Historians are very familiar with the different &lt;a href="http://www.jnd.org/dn.mss/affordances_and.html"&gt;affordances&lt;/a&gt; of the traditional codex.  In digital form, however, text can be "remixed" or "mashed-up" in ways that allow new kinds of interaction.  In a &lt;a href="http://gutenkarte.org/place/7142/13262"&gt;sample mashup&lt;/a&gt;, the text of Thucydides' &lt;span style="font-style:italic;"&gt;History of the Peloponnesian War&lt;/span&gt; was passed through a system that extracts geographic names and projected into an interface that includes an interactive map.  
In this new form, the book can still be read in the traditional fashion; it is now also possible to click on locations on the map and see corresponding passages in the text.  It is relatively easy to extract dates and plot them on an interactive timeline, providing a temporal browser as well as a spatial one.  If the editor of the volume needs to correct an error in the text, the mashup continues to work.  If the &lt;span style="font-style:italic;"&gt;History&lt;/span&gt; of Herodotus is digitized, it can be given a similar interface with little effort.  There are three key points about mashups.  First, they are heterogeneous, built from services that are supplied over the Internet.  Once one person or group figures out how to do something (like extract dates from a text) they can provide the service to anyone else who needs it.  There's no need to reinvent the wheel.  Second, mashups are live.  Data can be continuously updated without breaking the system.  Third, the range of web services is continuously expanding.  Increasingly, we will come to think of the work of the historian as the work of drawing live sources from archives, integrating those sources on the fly, interpreting them, and building that interpretation into tools that give the reader an unprecedented power to explore the evidentiary base from which our accounts are constructed.  
Historians, in other words, will become designers of experiences and interactions.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8776732206510380540?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8776732206510380540'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8776732206510380540'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/05/what-its-about-3-interaction.html' title='What It&apos;s About 3: Interaction'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3335029462256917411</id><published>2007-05-18T10:01:00.000-04:00</published><updated>2007-05-18T11:22:42.448-04:00</updated><title type='text'>What It's About 2: More is Different</title><content type='html'>In 1972, the physicist P. W. Anderson published an essay titled "More is Different," arguing against the idea of scientific reductionism (&lt;span style="font-style:italic;"&gt;Science&lt;/span&gt; 177, no. 
4047).&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;The main fallacy in this kind of thinking is that the reductionist hypothesis does not by any means imply a 'constructionist' one: The ability to reduce everything to simple fundamental laws does not imply the ability to start from those laws and reconstruct the universe. ... The constructionist hypothesis breaks down when confronted with the twin difficulties of scale and complexity. The behavior of large and complex aggregates of elementary particles, it turns out, is not to be understood in terms of a simple extrapolation of the properties of a few particles.  Instead, at each level of complexity entirely new properties appear, and the understanding of the new behaviors requires research which I think is as fundamental in its nature as any other.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;This is the second sense in which digital history is not about computers.  Taken by itself (or in conjunction with a user and stand-alone application software), the properties of a single computer tell us almost nothing about the properties and possibilities of densely interconnected networks of people, machines and software.  Researchers in &lt;a href="http://dmoz.org/Computers/Artificial_Life/"&gt;artificial life&lt;/a&gt; and related fields have shown that routine interactions among simple, identical agents can result in complex and unpredictable swarm dynamics.  The heterogeneous assemblages that provide a context for digital history are far richer, perhaps best captured with the metaphor of &lt;a href="http://www.amazon.com/Information-Ecologies-Using-Technology-Heart/dp/0262640422/"&gt;information ecologies&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Having &lt;span style="font-weight:bold;"&gt;more&lt;/span&gt; gives you completely different capabilities.  Take the example of Amazon's database of customer information.  
When you look at an item, the system can provide you with pointers to related items: "people who looked at / bought this also looked at / bought x, y, z."  As &lt;span style="font-style:italic;"&gt;you&lt;/span&gt; look at and buy items, the system becomes smarter.  Idiosyncrasies of individual browsing or purchasing are ironed out as more data are collected.  Over time, new associations may develop and old ones disappear, providing insight into historical trends.  The Amazon book database is already so powerful that no humanist can afford to ignore it, but the strength of this approach is not limited to commercial applications.  Suppose that scholarly books and articles were indexed in a similar way: "people who looked at / cited this also looked at / cited x, y, z."  It becomes trivial to do kinds of literature review that are normally very difficult.  One of these is to assess the downstream impact of a given work: which works cite this one?  Another is to find related but isolated research groups, who may be citing an overlapping literature but are apparently unaware of one another's work.&lt;br /&gt;&lt;br /&gt;Having &lt;span style="font-weight:bold;"&gt;more&lt;/span&gt; changes our ideas of what history and memory are.  Roy Rosenzweig's essay on &lt;a href="http://www.historycooperative.org/journals/ahr/108.3/rosenzweig.html"&gt;scarcity and abundance&lt;/a&gt; should be required reading for all historians.  I've already written about &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/04/information-costs.html"&gt;information costs&lt;/a&gt;, so I won't go into detail here, except to say that historical projects have largely been defined by what we can't find or know, and that's about to change.  Having nearly frictionless access to vast amounts of source material makes it possible to undertake projects that hinge on attested, but very-low-frequency evidence.  Having more of everything also means that attention becomes a scarce resource.  
As scholars, our reputations and careers are increasingly shaped by the logic of the gift.&lt;br /&gt;&lt;br /&gt;Finally, &lt;span style="font-weight:bold;"&gt;more&lt;/span&gt; is about to become &lt;span style="font-weight:bold;"&gt;an awful lot more&lt;/span&gt;.  Technologies like &lt;a href="http://www.privcom.gc.ca/fs-fi/02_05_d_28_e.asp"&gt;RFID&lt;/a&gt; and &lt;a href="http://www.memsnet.org/mems/what-is.html"&gt;MEMS&lt;/a&gt; make it possible to create vast sensor networks that continuously record data in unimaginable quantities, or that can track the history of practically any object of interest.  &lt;a href="http://mase.itc.nagoya-u.ac.jp/CARPE2006/"&gt;CARPE&lt;/a&gt; researchers study the capture, archival and retrieval of personal experiences across a lifetime.  If you thought &lt;a href="http://www.pepysdiary.com/"&gt;Samuel Pepys&lt;/a&gt; left a lot of material, you haven't seen anything yet.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3335029462256917411?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3335029462256917411'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3335029462256917411'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/05/what-its-about-2-more-is-different.html' title='What It&apos;s About 2: More is Different'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-2710832587319090613</id><published>2007-05-07T16:02:00.000-04:00</published><updated>2007-05-07T16:56:46.249-04:00</updated><title type='text'>What It's About 1: Links and Bias</title><content type='html'>In my last post I suggested that digital history isn't about computers, although it may have seemed more reasonable to think so in the mid-1980s.  In fact, around that time the influential computer scientist Edsger Dijkstra made the &lt;a href="http://www.cs.utexas.edu/users/EWD/transcriptions/EWD09xx/EWD924.html"&gt;provocative argument&lt;/a&gt; that even &lt;span style="font-weight:bold;"&gt;computer science&lt;/span&gt; isn't about computers.  Referring to the subject as computer science, he wrote, "is like referring to surgery as 'knife science'..."&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;We now know that electronic technology has no more to contribute to computing than the physical equipments. We now know that programmable computer is no more and no less than an extremely handy device for realizing any conceivable mechanism without changing a single wire, and that the core challenge for computing science is hence a conceptual one, viz. what (abstract) mechanisms we can conceive without getting lost in the complexities of our own making. ... This discipline, which became known as Computing Science, emerged only when people started to look for what be common to the use of any computer in any application. 
By this abstraction, computing science immediately and clearly divorced itself from electronic engineering: the computing scientist could not care less about the specific technology that might be used to realize machines, be it electronics, optics, pneumatics, or magic.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;To some extent, digital history inherits this indifference to underlying mechanism.  We're better off to focus our attention instead on what the technology allows us to do.&lt;br /&gt;&lt;br /&gt;Links, it is said, are the currency of the web.  They make it possible to navigate from one context to another with a single click.  For human users, this greatly lowers the transaction costs of juxtaposing two representations.  The link is abstract enough to serve as a means of navigation and able to subsume traditional scholarly activities like footnoting, citation, glossing and so on.  Furthermore, extensive hyperlinking allows readers to follow nonlinear and branching paths through texts.  So much is well known to humanists.  Fewer seem to realize that links are constantly being navigated by a host of artificial users, colorfully known as spiders, bots or crawlers.  A computer program downloads a webpage, extracts all of the links on it, and follows each in turn, downloading the new pages that it encounters along the way.  Using tools like this, students of the internet can map the topology of subnetworks.  Some pages serve as hubs, with millions of inbound links.  Some are bridges that connect two network regions that are otherwise very sparsely interconnected.  Done ceaselessly on a large enough scale, a dynamic and partial map of the internet emerges from spidering, and this serves as the basis for search engines.  &lt;br /&gt;&lt;br /&gt;Stop for a moment and think about search engines.  Google handles more than ninety million &lt;a href="http://searchenginewatch.com/showPage.html?page=2156461"&gt;search requests&lt;/a&gt; per day.  
For the vast majority of those searches, there will be far too many hits for the user to look at more than a tiny fraction of the results.  Instead, he or she will concentrate on the top 10 or 20 hits.  Google (and a few other companies like Yahoo! and MSN) are introducing biases into research results by ranking the hits.  That's unavoidable, and historians, at least, take bias for granted.  It is something to be thought about, not something that can be eliminated.  I would argue, however, that search engine result ranking is the single most pervasive form of bias that has &lt;span style="font-style:italic;"&gt;ever&lt;/span&gt; existed. When Google says that their &lt;a href="http://www.google.com/corporate/"&gt;mission&lt;/a&gt; "is to organize the world's information and make it universally accessible and useful," they're not kidding.  Do you know how search engines work?  Can you afford not to?&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-2710832587319090613?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2710832587319090613'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/2710832587319090613'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/05/what-its-about-1-links-and-bias.html' title='What It&apos;s About 1: Links and Bias'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1887742200842861302</id><published>2007-05-05T07:01:00.000-04:00</published><updated>2007-05-05T07:59:54.386-04:00</updated><title type='text'>It's Not About Computers</title><content type='html'>On January 3, 1983, &lt;span style="font-style:italic;"&gt;Time&lt;/span&gt; magazine declared that the 1982 "man of the year" was actually a machine: &lt;a href="http://www.time.com/time/magazine/0,9263,7601830103,00.html"&gt;the computer&lt;/a&gt;.  "There's a new world coming again," Roger Rosenblatt wrote, "looming on the desktop." A series of articles provided a thumbnail history of computing, described different brands of hardware, predicted huge impact and "awesome" sales figures, introduced people like Jobs and Wozniak and walked through a simple programming example.  There was even a glossary for "gweeps."  (According to &lt;span style="font-style:italic;"&gt;Time&lt;/span&gt;, a "gweep" was a hacker suffering from overwork.  With 47,000 hits on Google today, the word is encountered just a bit more frequently than "absquatulate.")  "All clear?" Otto Friedrich asked, "Those who think so are called 'computer literate,' which is synonymous with young, intelligent and employable; everybody else is the opposite."&lt;br /&gt;&lt;br /&gt;1983 was probably a good year to start thinking about introducing personal computers into university coursework.  Many people had been using them for years already, and it was clear that they would play a very important role in the decades to follow.  Some historians and history educators were already there.  
Joanne Francis published &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.worldcat.org/oclc/12448617&amp;referer=brief_results"&gt;Microcomputers and Teaching History&lt;/a&gt;&lt;/span&gt; in 1983.  Richard J. Jensen's &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.worldcat.org/oclc/62983401&amp;referer=brief_results"&gt;Microcomputer Revolution for Historians&lt;/a&gt;&lt;/span&gt; came out the following year, as did Roy Rosenzweig's &lt;a href="http://www.worldcat.org/oclc/93128849&amp;referer=brief_results"&gt;article&lt;/a&gt; on using databases for oral history.  Teaching history students how to use computers was a really good idea in the early 1980s.&lt;br /&gt;&lt;br /&gt;It's not anymore. Students who were born in 1983 have already graduated from college. If they didn't pick up the rudiments of word processing and spreadsheet and database use along the way, that's tragic.  But if we concentrate on teaching those things now, we'll be preparing our students for the brave new world of 1983.  &lt;br /&gt;&lt;br /&gt;So if digital history isn't about computers, what &lt;span style="font-weight:bold;"&gt;is&lt;/span&gt; it about?  
Stay tuned.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1887742200842861302?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1887742200842861302'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1887742200842861302'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/05/its-not-about-computers.html' title='It&apos;s Not About Computers'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-985855782701714318</id><published>2007-04-22T12:26:00.000-04:00</published><updated>2007-04-22T13:58:48.606-04:00</updated><title type='text'>The Trouble with Modernity</title><content type='html'>When I first started designing with &lt;a href="http://technic.lego.com/"&gt;LEGO Technic&lt;/a&gt; in the mid-1980s, I went out and bought a number of &lt;a href="http://www.planomolding.com/"&gt;fancy tackle boxes&lt;/a&gt; to hold all of the different pieces.  I would make something, use it for a while, and then break it down into its &lt;a href="http://designinsite.dk/htmsider/m0007.htm"&gt;ABS&lt;/a&gt; atoms, putting each beam, brick, axle, gear, wheel, pin and plate into its own little compartment.  
It's a very modern impulse, to see things in terms of abstract, discrete, separable units which can be arrayed into a grid.&lt;br /&gt;&lt;br /&gt;Later, when I was working on my PhD, Deborah Fitzgerald suggested that I read Douglas Harper's ethnography of technology &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Working-Knowledge-Skill-Community-Small/dp/0520079701/"&gt;Working Knowledge&lt;/a&gt;&lt;/span&gt;.  It's really a great book, a study of a craftsman named Willie who runs a small shop in northern New York.  When I came across this passage I had a moment of enlightenment:&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;For Willie "junk" cars are a storehouse of parts.  It is not only the major components, such as engines and transmissions, that are valuable when a car is junked.  The small parts, even brackets or fasteners, also have value far beyond their simple monetary worth.  This is especially true for a mechanic like Willie who works almost exclusively on a single make of car.  Twenty or thirty miscellaneous junked cars sitting outside a general mechanic's shop would be hard to use efficiently.  Twenty to thirty old Saabs, however, constitute a cache of parts for a rather esoteric automobile.  Saab parts are hard to find and expensive.  Because the cars Willie fixes are often seven to fifteen years old, the parts needed may be obsolete or unavailable -- even just a bolt with a specific length and thread.  But the part is "catalogued" on a car sitting outside, ready to be used. (pp.152,154)&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Up until that moment, I had always seen car junkyards as I imagine most people see them: messy, anti-modern even.  
The idea that every piece of a (partially) assembled car serves as a context for understanding and finding the rest of the pieces had never occurred to me.&lt;br /&gt;&lt;br /&gt;Twenty years on, there are more and &lt;a href="http://mindstorms.lego.com/"&gt;cooler LEGO parts&lt;/a&gt; to hack than ever before.  Now, however, I tend to keep my previous creations as long as possible, disassembling them only as necessary.  Knowing what I did before helps me to remember good &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/03/idioms.html"&gt;idioms&lt;/a&gt; and prevents me from reusing bad ones.  Having multiple prototypes in draft also lets me combine the best ideas of the bunch, or can provide a tangible signal when it is time to try a fresh approach.  It's possible that this kind of messiness even helps us to be more creative.  In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Natural-Born-Cyborgs-Technologies-Future-Intelligence/dp/0195177517/"&gt;Natural-Born Cyborgs&lt;/a&gt;&lt;/span&gt;, Andy Clark describes studies that suggest that our ability to imagine or visualize things is constrained in ways that our perceptual abilities may not be.  Describing our ability to understand visual forms with multiple interpretations he writes&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;Given the evident constraints on our ability to find new interpretations using mental imagery alone, it is not surprising that the discovery of such multiple interpretable forms turns out to depend heavily on a kind of looping process.  In this looping process the artist first sketches and then perceptually, not merely imaginatively, re-encounters visual forms, which she can then inspect, tweak, and re-sketch so as to create a final product that supports a densely multilayered set of structural interpretations.  The fossil trail of this process remains visible in the sequence of sketches themselves. 
(pp.76-77)&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;Fossil trail, indeed.  Designing from scratch is like trying to start every evolutionary process with atoms.  You get a lot more variety if you can mix and match at successive levels of complexity (for more on this, see John Maynard Smith and Eors Szathmary, &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Major-Transitions-Evolution-Maynard-Smith/dp/019850294X/"&gt;The Major Transitions in Evolution&lt;/a&gt;&lt;/span&gt;.)  The modern impulse toward analysis and organization is a powerful tool, but we can't let the aesthetic get in our way.&lt;br /&gt;&lt;br /&gt;In Web 2.0 applications, there has been a trend toward &lt;a href="http://www.anildash.com/magazine/2002/11/introducing_the.html"&gt;microcontent&lt;/a&gt;.   On the plus side, this has led to easy remixing and the wonderful variety of mashups. We have to be careful, however, to maintain and index the context in which these chunks are used and reused.  One of the fundamental principles of archival practice is respect for the &lt;a href="http://www.archivists.org/glossary/term_details.asp?DefinitionKey=69"&gt;original order&lt;/a&gt; of records.  In a sense, this is akin to the practice of keeping mostly whole cars in junkyards.  
It may not appear to be the most "logical" way of preserving information, but in the long run it may turn out to be the most useful way.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/analysis+synthesis" rel="tag"&gt;analysis and synthesis&lt;/a&gt; | &lt;a href="http://technorati.com/tag/bricolage" rel="tag"&gt;bricolage&lt;/a&gt; | &lt;a href="http://technorati.com/tag/context" rel="tag"&gt;context&lt;/a&gt; | &lt;a href="http://technorati.com/tag/findability" rel="tag"&gt;findability&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/lego" rel="tag"&gt;LEGO&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-985855782701714318?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/985855782701714318'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/985855782701714318'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/04/trouble-with-modernity.html' title='The Trouble with Modernity'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1186649027713219611</id><published>2007-04-17T15:37:00.000-04:00</published><updated>2007-04-17T17:03:24.530-04:00</updated><title type='text'>Luddism Is a Luxury You Can't Afford</title><content type='html'>Anyone who works in digital humanities encounters self-proclaimed Luddites from time to time.  
I have to admit that these people used to annoy me a lot, but I've recently discovered that a sweeping dismissal of technology can make a nice jumping-off point for a more nuanced discussion.  I now start by asking what kinds of technology people particularly dislike.  Often the answer is that they don't like gadgets, especially ones that they can't figure out how to use.  I agree that there are plenty of irritating and pointless gizmos in the world.  &lt;br /&gt;&lt;br /&gt;(Figuring out how to use these things may be their most interesting &lt;a href="http://www.jnd.org/dn.mss/affordances_and.html"&gt;affordance&lt;/a&gt;.  In the early 1980s, I once spent about 36 hours cracking the copy protection on a friend's computer game.  The puzzle posed by the game turned out to be so inferior to the one posed by its security that I deleted the game immediately.)&lt;br /&gt;&lt;br /&gt;I like to follow up by checking the depth of my interlocutor's commitment to Luddism.  Are they willing to do without electricity?  Anesthetics?  Running water?  Architecture?  Clothing?  How about literacy?  I have yet to meet someone who knows the word "Luddite" who would actually be willing to give up the ability to read.&lt;br /&gt;&lt;br /&gt;It turns out that what's really interesting about latter-day Luddism is that it teaches you a lot about the visibility of particular technologies, and by extension, about the place of the human mind in the world.  Someone who can sip a double tall non-fat latte while decrying technology isn't really a hypocrite.  They just don't see how the drink in their hand is articulated with global flows of material, energy and information.  In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Being-Time-Martin-Heidegger/dp/0060638508/"&gt;Being and Time&lt;/a&gt;&lt;/span&gt;, Heidegger famously distinguished between things being ready-to-hand and present-at-hand.  While you're sipping out of a cup it is ready-to-hand.  
You don't notice it unless something goes wrong.  If the seam melts and drops scalding coffee in your lap, the cup and the coffee become present-at-hand.  Instead of being part of you, part of the untrammeled experience of drinking, they are now perceived as external, something you have to deal with.  When people claim to be Luddites, I think that they are really objecting to the experience of a class of things that always seem to be present-at-hand.  It's hard not to like the things that have already become ready-to-hand.  These invisible and pervasive technologies are exactly the ones that humanists should be thinking about, though, because they have the deepest implications for who and what we are.  In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Natural-Born-Cyborgs-Technologies-Future-Intelligence/dp/0195177517/"&gt;Natural-Born Cyborgs&lt;/a&gt;&lt;/span&gt;, Andy Clark writes that "what is special about human brains, and what best explains the distinctive features of human intelligence, is precisely their ability to enter into deep and complex relationships with nonbiological constructs, props, and aids.  This ability, however, does not depend on physical wire-and-implant mergers, so much as on our openness to information-processing mergers."  "Tools-R-Us," he says, "and always have been." (5,7)&lt;br /&gt;&lt;br /&gt;So enjoy the latte, the iPod, the microfiber clothing.  By all means, cycle or take a train instead of driving a car.  
If you're really concerned about technology, however, remember that it has the most potential to be dangerous when you stop seeing it.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/affordances" rel="tag"&gt;affordances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/present+at+hand" rel="tag"&gt;present-at-hand&lt;/a&gt; | &lt;a href="http://technorati.com/tag/ready+to+hand" rel="tag"&gt;ready-to-hand&lt;/a&gt; | &lt;a href="http://technorati.com/tag/technology" rel="tag"&gt;technology&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1186649027713219611?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1186649027713219611'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1186649027713219611'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/04/luddism-is-luxury-you-cant-afford.html' title='Luddism Is a Luxury You Can&apos;t Afford'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3348662119856861859</id><published>2007-04-16T18:13:00.000-04:00</published><updated>2007-04-16T18:38:03.656-04:00</updated><title type='text'>Hacking the Hacks</title><content type='html'>I think one of my favorite things about having this blog is that people occasionally take one of my hacks and improve it.  
This kind of collaborative &lt;a href="http://digitalhistoryhacks.blogspot.com/2006/05/blogging-andas-stepwise-refinement.html"&gt;stepwise refinement&lt;/a&gt; is, of course, a key argument for open source.  I enjoy learning about other people's hacks and often incorporate their suggestions into my own working versions.&lt;br /&gt;&lt;br /&gt;Last month, for example, Rob Nelson of the Technology Integration Program at William and Mary wrote some PHP code to create his own &lt;a href="http://tip.wm.edu/?p=172"&gt;field-at-a-glance page&lt;/a&gt; for antebellum America.  More recently, Matt Joyce greatly improved the &lt;a href="http://babbaging.blogspot.com/2007/04/python-amazon-graphs-oh-my.html"&gt;exploratory bibliography&lt;/a&gt; code and posted a tutorial at his new blog Babbaging.  Both Rob and Matt have nice clear explanations of what they did.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+source" rel="tag"&gt;open source&lt;/a&gt; | &lt;a href="http://technorati.com/tag/stepwise+refinement" rel="tag"&gt;stepwise refinement&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3348662119856861859?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3348662119856861859'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3348662119856861859'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/04/hacking-hacks.html' title='Hacking the Hacks'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-7990477099841161126</id><published>2007-04-11T19:11:00.000-04:00</published><updated>2007-04-11T20:01:14.750-04:00</updated><title type='text'>History Appliances: The Metronome</title><content type='html'>I recently started reading John Thackara's new book &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Bubble-Designing-Complex-World/dp/0262201577/"&gt;In the Bubble: Designing in a Complex World&lt;/a&gt;&lt;/span&gt;, where I came across this vivid depiction of the infinite archive &lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;On just one single day of the days I have spent writing this book, as much world trade was carried out as in the whole of 1949; as much scientific research was published as in the whole of 1960; as many telephone calls were made as in all of 1983; as many e-mails were sent as in 1990 (p.5)&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;The statement provides food for thought on a number of levels.  How long does it take for the volume of something to increase by a couple of orders of magnitude?  What roles do transaction and information costs play?  We might also think about converse situations: On a single day in year 18xx, more whalebone corsets were made than in the whole of 2005.  Statements of this form allow us to formulate relations between different time periods in quantitative terms.&lt;br /&gt;&lt;br /&gt;How might we convey such information about historic rates or volumes in a more tangible or peripheral way?  Suppose we wanted to get a feel for the increasing amount of e-mail exchanged as the nineties unfolded.  
One possibility would be to hook up a stream of historical data to a &lt;a href="http://www.metronomeonline.com/"&gt;metronome&lt;/a&gt; driven by a &lt;a href="http://phidgets.com/index.php?module=pncommerce&amp;func=itemview&amp;KID=117633564574.112.160.80&amp;IID=16"&gt;servomotor&lt;/a&gt;.  As the years slowly scroll by on an odometer-like display, the tempo increases from largo, through adagio to &lt;a href="http://www.allmusic.com/cg/amg.dll?p=amg&amp;sql=77:2639"&gt;hardcore techno&lt;/a&gt;.  You probably have to turn it off at that point.  Not only will the sheer volume of e-mail cause the metronome to tear itself to pieces if you let it continue towards Y2K, but hardcore techno was, as AllMusic tells us, "practically ... a dinosaur by the end of the decade."&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/history+appliances" rel="tag"&gt;history appliances&lt;/a&gt; | &lt;a href="http://technorati.com/tag/information+costs" rel="tag"&gt;information costs&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/phidgets" rel="tag"&gt;phidgets&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-7990477099841161126?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7990477099841161126'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/7990477099841161126'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/04/history-appliances-metronome.html' title='History Appliances: The Metronome'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-9113207504719290677</id><published>2007-04-02T09:30:00.000-04:00</published><updated>2007-04-02T17:09:09.781-04:00</updated><title type='text'>The Alien's Ruler</title><content type='html'>Back in the days of Usenet, there was a &lt;a href="http://groups.google.com/group/rec.puzzles/browse_thread/thread/2fb8426d9da1fea0/"&gt;thread&lt;/a&gt; on rec.puzzles about an alien who comes to Earth, encodes the contents of the &lt;span style="font-style:italic;"&gt;Encyclopedia Britannica&lt;/span&gt; as a single mark on a metal rod and takes it back to his home planet.  The gist of the puzzle was that (1) it's possible to encode information as numbers, and (2) if you concatenate a bunch of numbers together and precede them with a zero and a decimal point, you have the decimal representation of a fraction.  Assuming the alien knows the length of the metal rod, each possible mark on it would correspond to some fraction. &lt;br /&gt;&lt;br /&gt;You probably remember learning that if you're given two different fractions, it's always possible to find another that lies between them.  Between three-fifths and four-fifths, for example, is seven-tenths ... not to mention an infinite number of close friends.  So what prevents alien or human technologists from using metal rods as arbitrarily dense analog storage devices?  
Measurement.&lt;br /&gt;&lt;br /&gt;Seth Lloyd works through the physics in his enjoyable book &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Programming-Universe-Quantum-Computer-Scientist/dp/1400040922/"&gt;Programming the Universe&lt;/a&gt;&lt;/span&gt; (pp.22-24):&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;just as there are an infinite number of real numbers between 0 and 1, there are apparently an infinite number of possible lengths between zero meters and one meter.  The reason that apparently continuous quantities such as the length of a metal rod can register only a finite amount of information is that these quantities are typically defined only to a finite level of precision. To see the trade-off between precision and information, think of measuring the length of that rod using a meterstick.  The meterstick is made of wood.  One hundred centimeters are marked and numbered on the stick.  One thousand millimeters are marked, ten for each centimeter, but there is not enough room on the meterstick to number them legibly. You can use the meterstick to measure the length of the rod to the accuracy of about a millimeter.  Below a millimeter, a meterstick does not measure distances well, simply because its physical characteristics give it a finite resolution.  The total number of alternatives is 1,000, corresponding to three digits of accuracy, or about ten bits of information.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;(Think of a bit as a switch that can take two values, 0 and 1.  If you have two bits, you can store four alternatives: 00, 01, 10, 11.  If you have three bits you can store eight alternatives: 000, 001, 010, 011, 100, 101, 110, 111.  With n bits, you can store 2^n alternatives.  Since 2^9=512 and 2^10=1024, you need about ten bits to store 1,000 alternatives.)&lt;br /&gt;&lt;br /&gt;Presumably aliens have better technology than wooden metersticks.  
But Lloyd works through a series of measurement devices, showing that each one doesn't buy that much more capability.  With an optical microscope or an interferometer you might get six digits of accuracy (about 20 bits).  With an atomic force microscope you might get ten digits of accuracy (33 bits)... but that requires the ability to sense individual atoms in the metal rod.&lt;br /&gt;&lt;blockquote&gt;&lt;br /&gt;To get thirty-three bits of information about the length of our rod, we have to count that length in atoms: heroic amounts of effort are typically required to wring more than a few tens of bits of information out of a single continuous quantity such as the length of a rod.  By contrast, if we use many individual quantities to register information, we can rapidly accumulate many bits. ... Our rod contains something like a billion billion billion atoms.  If each one registers a bit, then the atoms in the rod can register a billion billion billion bits, far more than the length of the rod on its own can register.  In general, the best way to get more information is not to increase the precision of measurements on a continuous quantity, but rather to put together measurements on more and more quantities, each of which may register only a few bits.  This compiling of bits--or digital representation--is effective because the number of total alternatives described grows much faster than does the number of bits.&lt;br /&gt;&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;I bring up this thought experiment because I think it highlights what is interesting about digital history.  Digital history isn't just about historical sources stored and manipulated on computers.  That's pretty old hat, dating at least to &lt;a href="http://nora.lis.uiuc.edu/xtf/view?docId=blackwell/9781405103213/9781405103213.xml&amp;chunk.id=ss1-1-2&amp;toc.depth=1&amp;toc.id=ss1-1-2&amp;brand=default"&gt;Father Busa's work&lt;/a&gt; with automated concordancing in the 1940s.  
And digital history isn't just about historical sources represented in digital form, which dates back &lt;a href="http://cdli.ucla.edu/"&gt;millennia&lt;/a&gt;.  Instead the "digital" of digital history points us toward sites where the digital or analog representation of past events--or the conversion from one form to the other--plays a role in historical consciousness.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/analog" rel="tag"&gt;analog&lt;/a&gt; | &lt;a href="http://technorati.com/tag/bits" rel="tag"&gt;bits&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digitization" rel="tag"&gt;digitization&lt;/a&gt; | &lt;a href="http://technorati.com/tag/information+theory" rel="tag"&gt;information theory&lt;/a&gt; | &lt;a href="http://technorati.com/tag/representation" rel="tag"&gt;representation&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-9113207504719290677?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9113207504719290677'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/9113207504719290677'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/04/aliens-ruler.html' title='The Alien&apos;s Ruler'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-3903218257318509493</id><published>2007-03-30T16:44:00.000-04:00</published><updated>2007-03-30T17:58:01.817-04:00</updated><title type='text'>Digital Infrastructure for Collaborative Research</title><content type='html'>&lt;a href="http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html"&gt;Web 2.0&lt;/a&gt; teaches us to think in terms of remixable web services that can join people together on the fly to create collective intelligences.  Such groups can readily solve problems that are well beyond the scope or capabilities of any particular individual, and thus will have important ramifications for future research.  One of the key opportunities for digital humanists is to find ways to harness this power in our collaborative work.&lt;br /&gt;&lt;br /&gt;Here I will lay out a few principles and ideas for such an infrastructure; as always, I'd be grateful for any feedback.&lt;br /&gt;&lt;br /&gt;1. &lt;span style="font-weight:bold;"&gt;Open access and open source&lt;/span&gt;.  The principle of open access is to make research results freely available to everyone.  Open access research reaches a greater audience than gated research and has more impact.  Errors are more readily caught.  Copies are more likely to be archived.  Social inequities are lessened.  The principle of open source is to make software and code freely available.  New tools can be built on the work of others, code is maintained indefinitely, and, again, errors are more readily caught.&lt;br /&gt;&lt;br /&gt;2. &lt;span style="font-weight:bold;"&gt;APIs&lt;/span&gt;.  
At a low level, the creation of application program interfaces allows content providers to share information in such a way that it can easily be reused.  It also becomes much easier for tool makers to integrate different sources of information as necessary, or as they become available.  For example, once Google provides a web service that maps things, and OCLC provides a web service that gives the location of library books, it is quite easy to create a "mashup" that plots the location of books on a map.  Any serious infrastructure for the digital humanities will require the low-level cooperation of researchers and repositories of cultural heritage.  One mechanism by which this may be accomplished is the creation and distribution of open source APIs for catalog software.  Then, when a library, archive or museum makes their catalog available online to human searchers, they will also be making it available to mashup developers.&lt;br /&gt;&lt;br /&gt;3. &lt;span style="font-weight:bold;"&gt;Information dashboards&lt;/span&gt;.  Individual researchers will need an interface that shows them the state of the field at a glance and facilitates communication with colleagues.  Presumably they will have an account with a customizable homepage that supports RSS feed remixing, collections of shared and private documents, access to text, audio and video communications channels, and other tools.  The system will flag information that they haven't seen yet.  New widgets can be added to support tasks like &lt;a href="http://www.amazon.com/Information-Trapping-Real-Time-Research-Web/dp/0321491718/"&gt;information trapping&lt;/a&gt;, customized search, data mining, and visualization.&lt;br /&gt;&lt;br /&gt;4. &lt;span style="font-weight:bold;"&gt;Browser-based tools&lt;/span&gt;.  For work with online sources, researchers will also make use of tools that are built into the browser, like &lt;a href="http://www.zotero.org/"&gt;Zotero&lt;/a&gt;.  
At the moment, Zotero allows you to automatically scrape bibliographic information from a webpage that you are viewing into a citation database.  Future releases will allow you to do things like synchronize your own database with a server, share citations with colleagues, or provide an RSS feed of your recently tagged items.  By making the server-side infrastructure compatible with Zotero, both tools will become more powerful.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/application+program+interface" rel="tag"&gt;application program interface&lt;/a&gt; | &lt;a href="http://technorati.com/tag/browser" rel="tag"&gt;browser&lt;/a&gt; | &lt;a href="http://technorati.com/tag/computer+supported+collaborative+work" rel="tag"&gt;computer supported collaborative work&lt;/a&gt; | &lt;a href="http://technorati.com/tag/data+mining" rel="tag"&gt;data mining&lt;/a&gt; | &lt;a href="http://technorati.com/tag/digital+history" rel="tag"&gt;digital history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/information+dashboard" rel="tag"&gt;information dashboard&lt;/a&gt; | &lt;a href="http://technorati.com/tag/mashups" rel="tag"&gt;mashups&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+access" rel="tag"&gt;open access&lt;/a&gt; | &lt;a href="http://technorati.com/tag/open+source" rel="tag"&gt;open source&lt;/a&gt; | &lt;a href="http://technorati.com/tag/zotero" rel="tag"&gt;zotero&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-3903218257318509493?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3903218257318509493'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/3903218257318509493'/><link rel='alternate' type='text/html' 
href='http://digitalhistoryhacks.blogspot.com/2007/03/digital-infrastructure-for.html' title='Digital Infrastructure for Collaborative Research'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-5368544886244252823</id><published>2007-03-18T16:01:00.000-04:00</published><updated>2007-03-18T16:47:24.560-04:00</updated><title type='text'>Phidgets: Introduction</title><content type='html'>Around 2001, University of Calgary computer scientists &lt;a href="http://pages.cpsc.ucalgary.ca/~saul/wiki/pmwiki.php"&gt;Saul Greenberg&lt;/a&gt; and his student Chester Fitchett found that their efforts to create new interactive media spaces were impeded by a "quagmire of tediousness."  Rather than spending their time designing the kinds of interactions that they were interested in, they had to focus on the disparate elements that made up their demonstrations, things like electrical circuits, components, low-level programming, wiring, and so on.  Graphical user interfaces already had software components called "&lt;a href="http://en.wikipedia.org/wiki/Widget_%28computing%29"&gt;widgets&lt;/a&gt;."  These are reusable elements like windows, radio buttons, pull-down menus and check boxes.  Once one person has figured out how to implement a widget, other people can add that widget to their interface with much less effort.  Wouldn't it be great, Greenberg and Fitchett &lt;a href="http://grouplab.cpsc.ucalgary.ca/papers/2001/01-Phidgets.CHIWorkshop/phidgets.report2001-681-04.pdf"&gt;asked&lt;/a&gt;, if there were such a thing as physical widgets?  
Thus &lt;a href="http://www.phidgets.com/"&gt;phidgets&lt;/a&gt; were born.&lt;br /&gt;&lt;br /&gt;You can now buy reusable physical components to measure light, sound, temperature, pH, vibration, force, human touch and motion, and a host of other things in the real world.  You can drive servo motors and interact with radio frequency ID chips (&lt;a href="http://www.rfidjournal.com/faq"&gt;RFIDs&lt;/a&gt;).  There are two- and three-axis accelerometers (the same kind of device used in the remote of the new Nintendo &lt;a href="http://us.wii.com/"&gt;Wii&lt;/a&gt;).  These phidgets are easily combined and recombined because, in the words of Greenberg and Fitchett, "they hide implementation and construction details while exposing functionality through a well-defined API."  In other words, you don't have to know how a magnetic sensor works at a physical or electrical level; you just need to know that if you plug it in, it will return a value that represents the strength of nearby magnetic fields.&lt;br /&gt;&lt;br /&gt;I've recently gotten a large collection of phidgets to use both for research (more about that later) and for public and digital history student projects.  Eventually our students will be able to design and implement things like interactive exhibits and &lt;a href="http://digitalhistoryhacks.blogspot.com/2007/03/coming-soon-history-appliances.html"&gt;history appliances&lt;/a&gt;.  In the meantime, however, I have to make sure that I know how they work and that the lower-level scaffolding is in place.  Phidgets don't come with &lt;a href="http://www.python.org"&gt;Python&lt;/a&gt; support right now, so I've had to use &lt;a href="http://docs.python.org/lib/module-ctypes.html"&gt;ctypes&lt;/a&gt; to dig into the dynamically linked C libraries.  
I know I'm not the only pythonista who wants to hack phidgets right now, so for the benefit of others, here are some partially implemented wrappers for the &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/phidgets-interface-kit-demo.py.html"&gt;8/8/8 interface kit&lt;/a&gt; and the &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/phidgets-quad-servo-demo.py.html"&gt;servo 4-motor kit&lt;/a&gt;.  Comments and improvements are always welcome, of course.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/phidgets" rel="tag"&gt;phidgets&lt;/a&gt; | &lt;a href="http://technorati.com/tag/python" rel="tag"&gt;python&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-5368544886244252823?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5368544886244252823'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/5368544886244252823'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/03/phidgets-introduction.html' title='Phidgets: Introduction'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1227346969830251184</id><published>2007-03-17T10:21:00.001-04:00</published><updated>2007-03-17T11:14:22.935-04:00</updated><title type='text'>Idioms</title><content type='html'>Part of learning how to program is learning a set of &lt;span style="font-style:italic;"&gt;idioms&lt;/span&gt;, conventional ways of solving recurring problems.  At first you have to think about these explicitly, but over time they become second nature, part of the way you see the world and act in it.&lt;br /&gt;&lt;br /&gt;For example, suppose you want to create a toggle, a switch that goes to its opposite value when you press it.  If it is on, it should turn off, and if it is off, it should turn on.  (And by convention, let's assume On = True = 1 and Off = False = 0.)  Many beginning programmers write something like this: IF SWITCH IS 1 THEN SET SWITCH TO 0 ELSE SET SWITCH TO 1.  There is nothing wrong with this code; it is easy to understand and works fine.  More experienced programmers have learned (or discovered for themselves) that they can say SET SWITCH TO 1-SWITCH instead.  So if the value of the switch is 0, then 1-0=1 and if the value of the switch is 1, then 1-1=0.  Same behavior, different idiom.&lt;br /&gt;&lt;br /&gt;In fact, the idea of a toggle is fundamental enough that it recurs across domains.  In electronics, for example, there is a well-known circuit called the &lt;a href="http://en.wikipedia.org/wiki/Flip-flop_(electronics)"&gt;flip-flop&lt;/a&gt;.  
It can be implemented in different ways (i.e., with different idioms) but all flip-flops can store a single &lt;a href="http://en.wikipedia.org/wiki/Bit"&gt;bit&lt;/a&gt;, a single On or Off value.  Miniaturized and in bulk, these tiny toggles serve as memory for computers.&lt;br /&gt;&lt;br /&gt;In graphical user interfaces, the well-known &lt;a href="http://en.wikipedia.org/wiki/Check_box"&gt;check box&lt;/a&gt; is a toggle.  If you take apart retractable ballpoint pens, you will find a number of different mechanical idioms for implementing two states for the nib: In and Out.  Although we don't often think about retractable pens in these terms, each has at least a single bit of memory.  If you dance the foxtrot, a series of &lt;a href="http://www.centralhome.com/ballroomcountry/foxtrot_steps1.htm#Left-turn"&gt;left turns&lt;/a&gt; can make you and your partner toggle forwards and back.  The same effect can be achieved in other styles of dance with other idioms.&lt;br /&gt;&lt;br /&gt;The idea of a toggle becomes part of a designer's toolkit, regardless of the expressive medium he or she is working in: circuitry, software, human interaction, mechanics, dance.  
As computing becomes &lt;a href="http://en.wikipedia.org/wiki/Ubiquitous_computing"&gt;pervasive&lt;/a&gt;, digital historians will need to find and master the idioms that best help them convey a sense of the past to their different audiences.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/bits" rel="tag"&gt;bits&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pedagogy" rel="tag"&gt;pedagogy&lt;/a&gt; | &lt;a href="http://technorati.com/tag/programming" rel="tag"&gt;programming&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1227346969830251184?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1227346969830251184'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1227346969830251184'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/03/idioms.html' title='Idioms'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-771737855984826126</id><published>2007-03-15T10:01:00.001-04:00</published><updated>2008-12-29T17:30:54.639-05:00</updated><title type='text'>How To: Do Simple Visualizations</title><content type='html'>One of the hats that I wear requires me to think strategically about the ways that Western's &lt;a href="http://history.uwo.ca/"&gt;history department&lt;/a&gt; and &lt;a href="http://history.uwo.ca/gradstudy/publichistory/"&gt;public history program&lt;/a&gt; are positioned on the web.  I spend a certain amount of time studying other departments and programs: who have they hired recently? what grants did they apply for and receive? where and what are they publishing? where do their students end up? what kind of web traffic do they have? which parts of their website are most dynamic? how are they positioned in search engine results?&lt;br /&gt;&lt;br /&gt;On a larger scale, the efforts of any particular department or program are made against the overall output of their university.  In &lt;span style="font-style:italic;"&gt;&lt;a href="http://www.amazon.com/Life-Style-Bruce-Mau/dp/0714845205/"&gt;Life Style&lt;/a&gt;&lt;/span&gt;, Bruce Mau writes that "The exercise of producing identity is all about giving environmental noise a defined pattern."  So if we think of a departmental identity as a pattern, we can think of the university as providing the environmental noise (or the carrier wave, if you prefer) that is being modulated.&lt;br /&gt;&lt;br /&gt;The point of visualization is to use your eyes to find interesting patterns in data that you might otherwise miss.  
IBM's &lt;a href="http://services.alphaworks.ibm.com/manyeyes/app"&gt;Many Eyes&lt;/a&gt; web service makes it easy to do simple visualizations without any programming.  For example, in order to assess the relative positioning of Canadian universities, I started by gathering &lt;a href="http://www.macleans.ca/universities/tool_research.jsp"&gt;reputation data&lt;/a&gt; from &lt;span style="font-style:italic;"&gt;Maclean's&lt;/span&gt; magazine.  Universities, like other institutions, tend to be circumspect about providing data that their competitors could use against them.  Nevertheless, I was able to collect information about unique US visitors to Canadian university websites using Compete's &lt;a href="http://snapshot.compete.com/"&gt;Snapshot tool&lt;/a&gt;.  Uploaded to Many Eyes, these data are easily plotted against one another as a &lt;a href="http://services.alphaworks.ibm.com/manyeyes/view/Sh3S9FsOtha6IiEzSlHGF2-"&gt;scatterplot&lt;/a&gt;. You can hover over a particular datum with the mouse cursor to learn more about that point.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlPaiZFZZI/AAAAAAAAAEg/aenPukvglrU/s1600-h/many-eyes-00.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 145px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlPaiZFZZI/AAAAAAAAAEg/aenPukvglrU/s200/many-eyes-00.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285342955023197586" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;Overall, the graph suggests that the universities that have a higher reputation (at least according to &lt;span style="font-style:italic;"&gt;Maclean's&lt;/span&gt;) also tend to receive more US visitors to their websites.  The graph also shows that highly ranked Francophone universities like &lt;a href="http://www.usherbrooke.ca/"&gt;Sherbrooke&lt;/a&gt; tend to generate very low US traffic.  (As we say in English, "quel surprise!"  
Dan Cohen recently &lt;a href="http://www.dancohen.org/blog/posts/its_about_russia"&gt;argued&lt;/a&gt; that it's pretty silly to use visualization to discover that Jesus is a big deal in the New Testament.)&lt;br /&gt;&lt;br /&gt;We would expect big universities like &lt;a href="http://www.utoronto.ca"&gt;Toronto&lt;/a&gt; or &lt;a href="http://www.yorku.ca/web/index.htm"&gt;York&lt;/a&gt; to generate a much larger web presence than tiny ones like &lt;a href="http://www.cbu.ca/cbu/_main/home.asp"&gt;Cape Breton&lt;/a&gt;.  So we really should try to take the size of the institution into account somehow.  I added 2005 &lt;a href="http://www.aucc.ca/publications/research/enrol_e.html"&gt;enrollment data&lt;/a&gt; from the Association of Universities and Colleges of Canada, and divided the number of website visitors by the number of students.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlPiq_FOVI/AAAAAAAAAEo/76Ez3W_n2t0/s1600-h/many-eyes-01.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 145px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlPiq_FOVI/AAAAAAAAAEo/76Ez3W_n2t0/s200/many-eyes-01.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285343094769006930" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;This lets us see that something interesting is happening in a few places, notably at the &lt;a href="http://www.uvic.ca"&gt;University of Victoria&lt;/a&gt; and &lt;a href="http://www.acadiau.ca/"&gt;Acadia&lt;/a&gt;.  Both schools have far more US website visitors than we would expect given the size of the institution.  UVic is home to the wonderful &lt;a href="http://www.canadianmysteries.ca/indexen.html"&gt;Great Unsolved Mysteries in Canadian History&lt;/a&gt; site, and to a vibrant &lt;a href="http://hcmc.uvic.ca/"&gt;humanities computing&lt;/a&gt; group.  Web-savvy may run in the blood there.  
Acadia's front page provides access to &lt;a href="http://www.youtube.com/profile?user=AcadiaWebmaster"&gt;school news via YouTube&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/visualization" rel="tag"&gt;visualization&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-771737855984826126?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/771737855984826126'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/771737855984826126'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/03/how-to-do-simple-visualizations.html' title='How To: Do Simple Visualizations'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlPaiZFZZI/AAAAAAAAAEg/aenPukvglrU/s72-c/many-eyes-00.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-8441231120679025333</id><published>2007-03-12T20:05:00.000-04:00</published><updated>2007-03-12T21:21:34.860-04:00</updated><title type='text'>Coming Soon: History Appliances</title><content type='html'>Imagine wandering into your living room after a day of work.  You sit down in your chair and turn a dial to 1973.  The stereo adjusts automatically, streaming Bob Marley, Elton John, Stevie Wonder and Jim Croce.  
LCD panels hanging on the wall switch to display Roberto Matta's &lt;span style="font-style:italic;"&gt;Jazz Bande&lt;/span&gt; and Elizabeth Murray's &lt;span style="font-style:italic;"&gt;Wave Painting&lt;/span&gt;.  If you check your TV listings, you'll find &lt;span style="font-style:italic;"&gt;Mean Streets, Paper Moon, American Graffiti, The Sting, Last Tango in Paris&lt;/span&gt; ... even &lt;span style="font-style:italic;"&gt;Are You Being Served?&lt;/span&gt;  In your newspaper you find stories about the cease-fire in Vietnam, about Watergate, about Skylab, about worldwide recession and OPEC and hostilities in the Middle East.  If you want to read a novel instead, you might try &lt;span style="font-style:italic;"&gt;Gravity's Rainbow&lt;/span&gt; or &lt;span style="font-style:italic;"&gt;Breakfast of Champions&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;With a little hacking, most of this scenario could be easily accomplished today.  The dial, for example, could be implemented with a &lt;a href="http://www.phidgets.com/index.php?module=pncommerce&amp;func=itemview&amp;IID=113"&gt;Phidgets circular touch sensor&lt;/a&gt;. Once you set the date, your computer could respond by switching your media player to a particular playlist, sending images to the &lt;a href="http://electronics.howstuffworks.com/digital-picture-frame.htm"&gt;picture frames&lt;/a&gt;, and feeding a stream of news from the &lt;a href="http://news.google.com/archivesearch"&gt;Google News Archive&lt;/a&gt; to your &lt;a href="http://www.eink.com/"&gt;E Ink&lt;/a&gt; reader.  Both the Pynchon and Vonnegut novels already exist in digital form.  It's only a matter of time until you can download them for a fee or print them on demand.  As TV moves online, it will also become easy to filter offerings according to user-determined criteria.  
In short, access to the infinite archive makes it easy to immerse yourself in sources from a particular milieu.&lt;br /&gt;&lt;br /&gt;Public historians will be able to find new roles designing historically accurate and interesting experiences for consumers, selecting from the welter of sources across media, combining and interpreting them.  We tend to think of historical production in terms of writing books, but there will be many more "&lt;a href="http://books.google.com/books?vid=ISBN0710030290&amp;q=%22machine+to+think+with%22&amp;dq=%22machine+to+think+with%22&amp;pgis=1"&gt;machines to think with&lt;/a&gt;."  The most subtle will shape our historical consciousness by working in &lt;a href="http://www.ubiq.com/hypertext/weiser/acmfuture2endnote.htm"&gt;the periphery&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/ambience" rel="tag"&gt;ambience&lt;/a&gt; | &lt;a href="http://technorati.com/tag/historical+consciousness" rel="tag"&gt;historical consciousness&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/pervasive+computing" rel="tag"&gt;pervasive computing&lt;/a&gt; | &lt;a href="http://technorati.com/tag/phidgets" rel="tag"&gt;phidgets&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-8441231120679025333?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8441231120679025333'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/8441231120679025333'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/03/coming-soon-history-appliances.html' title='Coming Soon: History Appliances'/><author><name>William J. 
Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-1501070072335963376</id><published>2007-03-06T15:48:00.001-05:00</published><updated>2008-12-29T17:26:33.298-05:00</updated><title type='text'>Design for a Kiosk in a Cabinet</title><content type='html'>Last year the Western Social Science building was renovated to include shallow glass-fronted display cabinets for each department. A small group of my colleagues gathered in front of ours to discuss how best to portray the department to students and visitors. Some thought that using the cabinet to display a collection of faculty-authored monographs would be boring ... which is funny, because writing such monographs is crucial for promotion and tenure in our department.  Anyway, the general consensus was that it would be nice to have something fun, something animated, maybe even interactive. &lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlOMYPp4AI/AAAAAAAAAEI/bHupiNI_tvI/s1600-h/cabinet.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 140px;" src="http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlOMYPp4AI/AAAAAAAAAEI/bHupiNI_tvI/s200/cabinet.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5285341612269494274" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The cabinet poses two main design constraints.  The first is that it is shallow, so there is room to fit a tablet computer, but not a laptop or desktop.  It would also be difficult to fit in an LCD projector unless you wanted to do something &lt;a href="http://www.kirchersociety.org/blog/?p=251"&gt;catoptric&lt;/a&gt;.  
The second constraint is that there is an air gap between the user and the display, so, unlike at a regular kiosk, the user can't touch anything.  Interactions have to be mediated by photons, or radio waves, or sound.  The cabinet is in a busy hallway near people's offices, so sound could get really irritating.&lt;br /&gt;&lt;br /&gt;I recently read &lt;a href="http://www.amazon.com/Analog-Digital-Out-Brendan-Interaction/dp/0321429168/"&gt;&lt;span style="font-style: italic;"&gt;Analog In, Digital Out&lt;/span&gt;&lt;/a&gt; by the interaction designer Brendan Dawes.  He's a big fan of using webcams in his designs, and I got to thinking that a camera would be one way to bridge the gap.  It would also mean that the user's own image has to become part of the design, but we can use that.  Since this will be a public history project, one idea that we might try to convey is situating the user in the flow of time.  By themselves, glass display cases superimpose a ghostly image of the viewer onto the contents of the case, so we will really be doubling this effect.  (For more on display cases, see Martin Roberts, "Mutations of the Spectacle: Vitrines, Arcades, Mannequins," &lt;span style="font-style: italic;"&gt;French Cultural Studies&lt;/span&gt; 2 (1991): 211-249.)&lt;br /&gt;&lt;br /&gt;I also like the idea of working with flows, something I've been thinking a lot about lately, whether in the form of &lt;a href="http://www.amazon.com/Information-Trapping-Real-Time-Research-Web/dp/0321491718/"&gt;RSS feeds&lt;/a&gt;, &lt;a href="http://www.amazon.com/Ecological-Approach-Visual-Perception/dp/0395270499/"&gt;perceptual experiences of place&lt;/a&gt;, or &lt;a href="http://www.amazon.com/Thousand-Plateaus-Capitalism-Schizophrenia/dp/0816614024/"&gt;lines of flight&lt;/a&gt;.  So we're aiming for a kiosk design that incorporates the user's image, uses a webcam for interaction, and conveys historical flow in some way.&lt;br /&gt;&lt;br /&gt;Here's what I came up with.  
In the cabinet there's a tablet computer with a webcam facing the user.  A series of historical images flows past in chronological order, partially transparent and projected over the user's image.  When they reach up to "touch" one of the images, the flow stops and that image is displayed in detail.  They're not really touching anything, of course.  The camera detects the motion of their hand in the air, and they see themselves touching something on the screen.  Here are some screen shots from my demo:&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlOV8mp4TI/AAAAAAAAAEQ/GPN3NroGjSI/s1600-h/kiosk01.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 91px;" src="http://4.bp.blogspot.com/_mg_RqiBYrpE/SVlOV8mp4TI/AAAAAAAAAEQ/GPN3NroGjSI/s200/kiosk01.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285341776648462642" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlOfuYdYBI/AAAAAAAAAEY/THLuthz0rnk/s1600-h/kiosk02.jpg"&gt;&lt;img style="cursor:pointer; cursor:hand;width: 200px; height: 91px;" src="http://3.bp.blogspot.com/_mg_RqiBYrpE/SVlOfuYdYBI/AAAAAAAAAEY/THLuthz0rnk/s200/kiosk02.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5285341944629518354" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;You can download the Python source &lt;a href="http://digitalhistory.uwo.ca/dhh/hacks/kiosk.py.html"&gt;here&lt;/a&gt;.  For my demo I used a lovely series of &lt;a href="http://www.mccord-museum.qc.ca/en/keys/virtualexhibits/magiclantern/"&gt;lantern slides from the McCord Museum&lt;/a&gt;.  Although the webcam motion detector interface works, I didn't implement the flow part because in an actual installation I would want to use the wireless capabilities of the tablet to tap into an RSS feed of images.  
That would require a little more programming.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;About this post&lt;/span&gt;. A couple of months ago I was tagged with a meme and passed it on to &lt;a href="http://airminded.org/2007/02/20/good-memes/"&gt;Brett Holman&lt;/a&gt;, among others, who responded by tagging me with a different meme: the &lt;a href="http://ilkeryoldas.blogspot.com/2007/02/thinking-blogger-awards_11.html"&gt;Thinking Blogger&lt;/a&gt; award.  Under the original terms of the award, I'm supposed to nominate five blogs which make me think.  Instead, I decided to use five blog posts / webpages to inspire a design and teach myself how to do something new.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;My former digital history student and all-around humble guy Jeremy Sandor suggested that &lt;a href="http://jeremysandor.blogspot.com/2007/02/play.html"&gt;public history should be play&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Brendan Dawes showed me how to use &lt;a href="http://www.brendandawes.com/mt/archives/000162.html"&gt;Play-Doh as an interface&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;Guyon Moree put the pieces together in &lt;a href="http://gumuz.looze.net/wordpress/index.php/archives/2005/06/06/python-webcam-fun-motion-detection/"&gt;Python&lt;/a&gt;.&lt;/li&gt;&lt;li&gt;&lt;a href="http://www.archiva.net/hist697ay07/index.html"&gt;Paula Petrik&lt;/a&gt; and &lt;a href="http://clioweb.org/archive/2007/01/31/readings-in-design-for-digital-humanities/"&gt;Jeremy Boggs&lt;/a&gt; continue to emphasize the role of design in digital history.&lt;/li&gt;&lt;li&gt;And the &lt;a href="http://thirdview.org/3v/rephotos/index.html"&gt;ThirdView Rephotographs&lt;/a&gt; suggest a future direction for designs like this one.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;You're it!&lt;br /&gt;&lt;br /&gt;Tags: &lt;a href="http://technorati.com/tag/flows" rel="tag"&gt;flows&lt;/a&gt; | &lt;a href="http://technorati.com/tag/hacking" rel="tag"&gt;hacking&lt;/a&gt; | &lt;a 
href="http://technorati.com/tag/historical+photographs" rel="tag"&gt;historical photographs&lt;/a&gt; | &lt;a href="http://technorati.com/tag/interaction+design" rel="tag"&gt;interaction design&lt;/a&gt; | &lt;a href="http://technorati.com/tag/public+history" rel="tag"&gt;public history&lt;/a&gt; | &lt;a href="http://technorati.com/tag/Python" rel="tag"&gt;Python&lt;/a&gt; | &lt;a href="http://technorati.com/tag/webcams" rel="tag"&gt;webcams&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/19974449-1501070072335963376?l=digitalhistoryhacks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1501070072335963376'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/19974449/posts/default/1501070072335963376'/><link rel='alternate' type='text/html' href='http://digitalhistoryhacks.blogspot.com/2007/03/design-for-kiosk-in-cabinet.html' title='Design for a Kiosk in a Cabinet'/><author><name>William J. Turkel</name><uri>http://www.blogger.com/profile/05033419379580138964</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='25' height='32' src='http://2.bp.blogspot.com/_mg_RqiBYrpE/SNT9VA6bvpI/AAAAAAAAAAY/gR5qgqKYCKI/S220/wjturkel.jpg'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_mg_RqiBYrpE/SVlOMYPp4AI/AAAAAAAAAEI/bHupiNI_tvI/s72-c/cabinet.jpg' height='72' width='72'/></entry><entry><id>tag:blogger.com,1999:blog-19974449.post-6570815156304495934</id><published>2007-02-25T10:33:00.000-05:00</published><updated>2007-02-25T11:50:41.708-05:00</updated><title type='text'>What's in the Other Corner?</title><content type='html'>I've just returned from an interdisciplinary workshop at Indiana University on "putting  memory in place."  
The organizers wanted to explore the ways that memories and places are linked, the forces that lead to individual or social forgetting, and the potential role for technology in resisting these forces.  The presentations and discussions were excellent.  One of the things that I found most interesting about the workshop was that an &lt;a href="http://www.interaction-design.org/"&gt;interaction design&lt;/
