Notes: Electronic Piers Plowman: Implementing an Edition of a Six-Hundred-Year-Old-Poem for Twenty-First Century Students

These are my own notes from the Research Conversation about representing Piers Plowman today, March 7. Presented by Terry Brooks and Miceal Vaughan. (Note (3/11/2008): I went back in and cleaned some of the formatting up on this, since apparently Windows Live Writer is not quite as consistent as I’d like.)

Tweeting

I’ll credit Zach Hale for first making me wonder why the hell Twitter was really even worth thinking about (though I can’t seem to locate my original comment on his blog to that effect).

After much resistance, I’ve finally set up my Twitter account (you can find it on my Profiles menu on this site’s navigation bar).  Why?  This series of articles had a lot to do with it, but I also decided that I’d take a page from the book of one of my co-workers, Martin Criminale, and at least try throwing my hat in the ring.  And, of course, Zach had a bit to do with it.

Now if they only had an import option that allowed me to upload contacts without sending out invitations (the Gmail contact import doesn’t appear to be working for me at this point).  I’m also curious about whether it might be possible to integrate my blog posts and my Twitter posts in such a way that they all appear in a continuous stream on this page (without necessarily being an entry in my WordPress RSS feed).  It’s probably doable, just a question of figuring out how.  Tweets, as they call Twitter entries, would have to be indicated, but that’s not overly hard.  Perhaps a combination of SimplePie and my standard WordPress template code?
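
As a rough illustration of the "continuous stream" idea, here is a minimal sketch in Python rather than the SimplePie/WordPress combination mentioned above. The entries are hypothetical sample data standing in for whatever the two feeds would actually return, and the names `merged_stream` and `render` are made up for this example.

```python
from datetime import datetime
from operator import itemgetter

# Hypothetical sample data; a real version would parse the blog and Twitter feeds.
blog_posts = [
    {"kind": "post", "when": datetime(2008, 3, 7, 18, 0), "text": "Notes: Electronic Piers Plowman"},
    {"kind": "post", "when": datetime(2008, 3, 5, 9, 30), "text": "Tautologies"},
]
tweets = [
    {"kind": "tweet", "when": datetime(2008, 3, 6, 12, 15), "text": "Finally set up a Twitter account."},
]


def merged_stream(*sources):
    """Combine the entries from every source into one list, newest first."""
    combined = [entry for source in sources for entry in source]
    return sorted(combined, key=itemgetter("when"), reverse=True)


def render(entry):
    """Format one entry as a line, marking tweets so they stand apart from posts."""
    label = "[tweet] " if entry["kind"] == "tweet" else ""
    return f"{entry['when']:%Y-%m-%d %H:%M} {label}{entry['text']}"


if __name__ == "__main__":
    for entry in merged_stream(blog_posts, tweets):
        print(render(entry))
```

The tweets never enter the blog's own feed here; they are only interleaved at display time, which is the behavior described above.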

Tautologies

A discussion in IMT 530 reminded me of tautologies – strictly speaking, logical assertions that come out true no matter what values their variables take – and of the truth tables used to check them.  A truth table shows, for every combination of variable values, whether a particular assertion holds.  Thus, if I treat two variables, A and B, as boolean values (true/false) and ask what happens when the AND operation and the OR operation are each applied to them, I end up with a table that looks like this:

A    B    A AND B    A OR B
T    T    T          T
T    F    F          T
F    T    F          T
F    F    F          F

This skips the formal notation.  You can go further – there is notation for implication (inference), for NOT (an inversion), and for NOR and NAND (“not or” and “not and”), which are simply the OR and AND results run through a NOT rather than anything fundamentally new.
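
A truth table like this can also be generated rather than written out by hand. The following is a minimal Python sketch; the function names (`nand`, `nor`, `truth_table`) are my own, and NAND and NOR are defined here as NOT applied to AND and OR respectively.

```python
from itertools import product


def nand(a, b):
    """NAND is NOT applied to the result of AND."""
    return not (a and b)


def nor(a, b):
    """NOR is NOT applied to the result of OR."""
    return not (a or b)


def truth_table():
    """Return one row per combination of A and B: (A, B, AND, OR, NAND, NOR)."""
    rows = []
    for a, b in product([True, False], repeat=2):
        rows.append((a, b, a and b, a or b, nand(a, b), nor(a, b)))
    return rows


def show(value):
    """Render a boolean as T or F."""
    return "T" if value else "F"


if __name__ == "__main__":
    headers = ["A", "B", "AND", "OR", "NAND", "NOR"]
    print(" ".join(headers))
    for row in truth_table():
        # Pad each cell to the width of its column header so the columns line up.
        print(" ".join(show(v).ljust(len(h)) for v, h in zip(row, headers)))
```

Rows are produced in the order TT, TF, FT, FF, matching the layout of the hand-written table.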

Résumé Updated

My résumé has been updated. I’m starting to wonder whether I need to trim the damned thing, since it does seem like there’s a lot on there, and some of it may stop being entirely relevant after a certain period of time. I’m still very proud of being Eastside Journal’s Most Inspirational Graduate of 2001, but how long does a high school graduation award actually matter? This is a bit of a trickier question, since I’m still in school. I’ve had people look at that document and think it way too long, while others think it proves that I have a vast array of experience (let’s ignore my personal reaction to that last opinion for the moment).

Firefox Speed Tweaks

I found these tips on speeding up Firefox here, and it does seem to speed it up significantly even on broadband. However, a couple of the flags (I suspect) refer to older Firefox versions than what I’m currently running (2.0.0.12). Here are the ones I set in the “about:config” screen:

  • network.http.max-connections: 48
  • network.http.max-connections-per-server: 16
  • network.http.max-persistent-connections-per-proxy: 8
  • network.http.max-persistent-connections-per-server: 4
  • network.http.pipelining: true
  • network.http.pipelining.maxrequests: 100
  • network.http.proxy.pipelining: true
  • Creating a new integer value:

    nglayout.initialpaint.delay: 0

I guess my only questions at this point are (a) how much these settings increase the load on web servers and (b) whether these are changes that should really be made. It seems like most of the boost came from the last new integer value, if anything at all sped it up, since painting is now nearly instantaneous. All the other flags do is increase the number of connections the browser is allowed to make (and how it’s allowed to make them, if I’m understanding the pipelining setting properly). Is there any documentation on about:config values?
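
Rather than clicking through about:config each time, the same values can be kept in a user.js file in the Firefox profile folder, where each line has the form user_pref("name", value);. Below is a minimal Python sketch that produces those lines from the settings listed above. The names `PREFS`, `user_pref_line`, and `render_user_js` are made up for this example, and whether every one of these flags still has an effect in a given Firefox version is not something this sketch checks.

```python
import json

# The values from the list above; nglayout.initialpaint.delay is the newly created integer.
PREFS = {
    "network.http.max-connections": 48,
    "network.http.max-connections-per-server": 16,
    "network.http.max-persistent-connections-per-proxy": 8,
    "network.http.max-persistent-connections-per-server": 4,
    "network.http.pipelining": True,
    "network.http.pipelining.maxrequests": 100,
    "network.http.proxy.pipelining": True,
    "nglayout.initialpaint.delay": 0,
}


def user_pref_line(name, value):
    """Build one user.js line; json.dumps writes true/false, bare integers, and quoted strings."""
    return f"user_pref({json.dumps(name)}, {json.dumps(value)});"


def render_user_js(prefs):
    """Return the full text of a user.js file, one preference per line."""
    return "\n".join(user_pref_line(name, value) for name, value in prefs.items()) + "\n"


if __name__ == "__main__":
    print(render_user_js(PREFS), end="")
```

Keeping the settings in one file also makes them easy to undo: delete the file and reset the affected entries in about:config.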

My Personal Information Management: Not Managed, Really

Something quite interesting popped into my head, and thus prompted this post.  As most know, I do a lot of reading as a part of my master’s studies, and have done a lot of reading in the past regarding a host of different topics, particularly during my undergraduate work at Evergreen.  Oddly, when I’m doing academic work, I almost never like to read anything else, since my energies tend to get a bit drained from having to keep up with the academic stuff in the first place – and there’s a residual effect as well, in that I don’t much feel like reading for a while after the academic year has ended.  Regardless, I find myself in a bit of a quandary; I’ve done a lot of reading on the subjects of sustainability and information management, but I really have no method as it stands of referencing all of that information or even recalling where something in particular cropped up.

This is a big problem, and spans a lot of different resources: textbooks, class notes, handouts, technical articles, magazine articles, programming code snippets, old web site designs, even in-line notations on whatever I’m reading.  I come up with ideas for projects that (no pun intended) peter out (cough) after a while, either for lack of motivation or for lack of appropriate reference material – in general, it tends to be more the former than the latter, but lack of reference material also rears its ugly head occasionally.  This isn’t because I lack the information; it’s because I’ve seen it somewhere but can’t find it again!

I’m not the only one.  Not by a long shot.  Everyone faces this.  I have a slight advantage in that I’m beginning to recognize some of the ways that this is solvable, but am at a slight disadvantage in that I am not quite as involved with stuff like social tagging or folksonomies – though I should note that Wikipedia has it wrong; folksonomies and social tagging are not the same thing, and saying they are is misleading.  Anyway, the main reason I have a problem is that I don’t have a quick way of finding any annotations or relevant readings for a particular topic.  If I wanted to remember a bit about economics, for instance (a highly relevant subject for me at the moment because of PB AF 594), I don’t have any way of knowing what articles I’ve read related to the subject or where my books are that cover that subject or what I might’ve taken as notes in classes three or four years ago that talked about the subject.  This is partly lack of time to look all this crap up.  This is also partly because that requires locating things – like my ink in my last blog post, I may not know it’s already around or may think I loaned the book on the subject to someone else.  I actually thought I had loaned one of my economics books to my mother (don’t ask me why I thought this) until, going to bed one night, I spotted it on a bookshelf directly across from the bed!

I’ve tried recently to reduce the amount of stuff I hang on to that makes it harder to find things.  I’ve started a “clippings binder”, where I rip out magazine articles that I think might be useful for future reference and recycle the rest of the magazine.  I can’t bring myself to do this for my copies of eco-structure, since those are just pure gold, but most of the other magazines I have floating around succumb to this sooner or later.  I can’t do this to books (and won’t – my father, who is doubtlessly reading this, would about have a conniption and ban me to the seventh or eighth layer of hell).  Last year before moving to Seattle, I donated a bunch of (admittedly mostly fiction) books to Olympia’s Goodwill branch to reduce the number of books I had sitting around.  But really, this hasn’t done much – I still have a lot of books I want to be able to reference.

There’s an extra dimension here – not only is there stuff I have read, but there’s stuff that looks relevant that I want to read, but can’t find the time.

It seems like the only really good way of doing this would be to start creating additional notes on every single book I read that might be relevant to future work, but that in and of itself is a lot of additional work.  Would it increase my ability to look for and find information?  Probably, especially if it were implemented correctly (I’d guess a wiki system with some sort of tagging grafted on would work quite well for this).  Perhaps I’ll take a sabbatical in 2009 after I graduate and spend the summer reading and making notes and putting them into some coherent system.  Yeah, right.  So how do we organize all these resources that we personally find relevant?  There are answers — maybe — and those answers are (fairly) likely to be relevant.  But in the meantime, if I want to remember all I’ve seen on sustainability, I’ll have to read it all over again, or at least spend a copious amount of time reading over whatever notes I made in the margins of books or on paper somewhere in a binder buried in my closet.
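
To make the "notes plus tagging" idea a little more concrete, here is a minimal Python sketch of a tag index: each note records where it came from, and a lookup returns every note carrying all of the requested tags. The class name `NoteIndex`, its methods, and the sample notes are hypothetical, and a real system (wiki-based or otherwise) would need persistence and search on top of this.

```python
from collections import defaultdict


class NoteIndex:
    """Map tags to (source, note) pairs so readings can be found again by topic."""

    def __init__(self):
        self._by_tag = defaultdict(set)

    def add(self, source, note, tags):
        """File one note under each of its tags (tags are case-insensitive)."""
        entry = (source, note)
        for tag in tags:
            self._by_tag[tag.lower()].add(entry)

    def find(self, *tags):
        """Return the notes that carry every one of the given tags."""
        if not tags:
            return set()
        matches = [self._by_tag.get(tag.lower(), set()) for tag in tags]
        return set.intersection(*matches)


if __name__ == "__main__":
    index = NoteIndex()
    # Hypothetical entries for illustration only.
    index.add("economics textbook", "supply and demand basics", ["economics"])
    index.add("magazine clipping", "cost of green roofs", ["sustainability", "economics"])
    for source, note in sorted(index.find("economics")):
        print(f"{source}: {note}")
```

The point of storing the source alongside the note is that the lookup answers "where did I see this?" and not only "what did it say?".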

That’s assuming those notes existed at all, and that’s a whole ‘nother problem.

Ink

I’ve been thinking I needed printer ink for the last several weeks, since my printer is reporting that several of the cartridges are getting quite low. I had intended to order some tonight, and nearly did until I opened my filing cabinet and found refills for every single ink cartridge I have.

Well, at least I found the cartridges before I ordered new ones…

Note – I use a business-level printer that does duplexing and provides an insane amount of paper storage capacity (and it’s got a wireless connection built in to boot). Why do I use something with that much power? Home-use printers seem to fall a bit short in the areas of networking and duplexing, so I went with a business model. This is an HP OfficeJet Pro K550dwtn (actually, it’s a K550dtwn), and thus far it has served me quite well. It helps that I keep my need for ink down by forcing all printouts to use only black ink and the “Fast/Economical Printing” setting (which is essentially draft printing). There is no visually appreciable difference between draft and normal print quality, except that draft printing uses a lot less ink.

Notes: Using Uncensored Communication Channels to Divert Spam Traffic, January 31, 2008

This was a presentation given by Benjamin Chiao from the University of Michigan – he’s currently a PhD student at their Information School, but also has an economics background, and much of the talk was couched in economic terms.

  • What’s the point of solving spam problem? Less time sorting spam, less economic cost for blocking spam, customers spend less money
  • $10 billion/year spent on spam related technologies
  • What is uncensored/open channel? keep inbox filters, no filters in special folder, guarantee delivery of messages into folder
  • Properly tagged messages will automatically be assigned to a folder/label
  • No new technological infrastructure required and fully reversible
  • Existing mechanisms to prevent spam: legal punishment, filters
  • Proposal of the open channel: decrease benefits of spamming by decreasing the number of recipients
  • Economics: micro-economic model shows open channels increase benefits to recipients, advertisers
  • This is not a unique mechanism – Chiao compared it to TV shopping channels: you don’t have to watch, but the information is constantly there
  • Open channel is like web sites – anyone can post
  • Not excluding the possibility of search within the open channel
  • Sender tags sent messages (as being part of the channel? This wasn’t clear in the talk)
  • The definition of spam used here specifically targets unsolicited commercial mass e-mails – no other message types are considered here
  • Current spam volumes are between 80-90% of total network traffic – 40% advertise medications, 19% is adult content, 41% other (according to Evett 2006)
  • Spammers continue because they are economically supported – there’s a point where the supply of spam must meet demand
  • Why do we need open channel? Why not just search for the content via existing search engines? Sites selling these products disappear too quickly: 30% of domains created die within a day (according to MessageLabs 2005)
  • Spammers need to keep pushing information to inboxes because they must move rapidly due to legal reasons
  • 60% of spam messages are sent by zombies – computers hijacked for the explicit purpose of sending spam
  • The CAN-SPAM Act has essentially legalized spamming
  • The open channel proposal separates the current e-mail ecosystem into two ecosystems – one “open” (the proposal) and one “traditional” (the current model)
  • Audience observation: this system assumes that EVERY e-mail system implements the open-channel concept
  • Current technology already partially implements this idea (sort of)
  • Spammers might be happier on open channel! 😀
  • This is still a theoretical idea
  • Essentially create two channels: one open and one censored (I’m not clear on whether the “channels” are analogous to the “ecosystems” mentioned above)
  • E-mail recipients opt in to the open channel in order to maximize their own utility
  • The sender gets its current revenue from the advertising charge times the number of mails received
  • The sender’s current cost is the constant reestablishment of sending channels (zombies)
  • The open channel attempts to establish equilibrium between advertisers and receivers of spam (note that advertisers, senders, and receivers are independent parties)
  • There is not just a supply curve but a demand curve for UCM
  • The open channel method induces UCM to move out of the current e-mail system
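
The sender-side arithmetic in these notes (revenue as the advertising charge times the number of mails received, set against the cost of reestablishing sending channels) can be sketched as below. This is only an illustration of that one relationship, not the model from the talk: the function names are made up, the cost is simplified to a single flat figure, and all of the numbers are hypothetical.

```python
def sender_revenue(charge_per_mail, mails_received):
    """Revenue as described in the notes: advertising charge times mails received."""
    return charge_per_mail * mails_received


def sender_profit(charge_per_mail, mails_received, channel_cost):
    """Revenue minus the cost of reestablishing sending channels (simplified to a flat cost)."""
    return sender_revenue(charge_per_mail, mails_received) - channel_cost


if __name__ == "__main__":
    # Hypothetical figures, in cents, chosen only to show the direction of the effect.
    charge, cost = 2, 500
    for received in (1000, 200):
        print(received, "mails received ->", sender_profit(charge, received, cost), "cents")
```

With the cost held fixed, fewer mails received means less revenue, which is the lever the proposal pulls when it tries to decrease the number of recipients reachable through the traditional channel.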

I’m not sure Benjamin gave sufficient background to make any of us fully appreciate the idea – there are two problems with it that I can see. First, it exists within the reality of economics, not the reality that we commonly deal with. Thus, it’s governed by the same economic laws that give me such a headache in PB AF 594, and understanding the concept requires a suspension of our own realities in order to appreciate the laws that govern the proposal. The second problem is that it’s not clear how this can be implemented within the current system. Is this a system that merely adds a tag to each message that identifies it as open-channel or “traditional”? How do you physically separate the two ecosystems without actually modifying the current e-mail structure, and how do you enforce proper usage of both ecosystems? An honor system in which we assume that the senders, the receivers, and the advertisers are all working to maximize their own utility (basically their net happiness) is perfect in economic theory because economic theory establishes that everyone will strive towards some theoretical maximum benefit, but in reality, it just doesn’t seem possible.

There was one thing that I want to follow up on – Benjamin mentioned the Attention-Bond Mechanism (Loder 2006) in his talk, so I’ll have to look up exactly what that entails (it’s a concept related to the acceptance or rejection of e-mail messages).