My Google AdSense Account: Moved to Where It Belongs
Honestly, I'm no longer sure how it happened, but suffice it to say that a few years ago I did something stupid.
No, no, it was nothing like that. I just applied for Google AdSense a few days before my actual 18th birthday. That, of course, netted me a declined application, because I was obviously still too young to participate in AdSense — but I wasn't counting on it also killing my ability to reapply later. When I tried again to sign up for AdSense using my main Google Account, after I was old enough, I got nothing but errors.
When I emailed Google AdSense Support about the problem, they said I could just reapply, but would have to use a different email address — meaning a different Google Account — to do so. I eventually did so, after I created an alias or two at Gmail, but I never used the approved account. I wasn't sure if I even wanted to try ads on this site, and I also had a hang-up about the principle of having one service in an account separate from all the others.
I did try to change the login email address associated with my approved AdSense account. The only problem was, my account's login couldn't be changed, because it was associated with a Gmail account. I was less than pleased, but figured it was a problem I could solve later.
And so it was: Last week, just in time for February break at college, I discovered that my primary Google Account again had the ability to apply for AdSense. Maybe there's some kind of expiration on declined applications; I haven't read enough of Google's policies to figure that out. (Who has the time, especially as a full-time college student?) So I did it: I reapplied for AdSense on my primary account.
Google's systems noticed that an active AdSense account already claimed my Payee Name. But instead of telling me I couldn't complete my application, as I expected, they asked if I wanted to transfer the account. What did I say? Yes, of course!
I filled out a short form, got a bit of data from my approved AdSense account, agreed to forfeit my $0.00 of unpaid earnings in the old AdSense account,1 submitted, and waited. In less than an hour, I got confirmations addressed to both my old and new email addresses that my account had been transferred. I logged into AdSense using my main Google Account, and it worked.
Technically, what Google did was close my old account and open a new one associated with my primary Gmail address. That's why unpaid earnings below the payment threshold didn't transfer. If I had generated ad code using the old account, I would have had to replace it. Not having to deal with that made a simple process even easier.
Thanks, Google. Every so often, you do something that makes me really happy. This was one of those things.
This change affects my website in a small way: I'm testing AdSense ads in the places supported by LightWord, the WordPress theme I use, whose development I have kind of taken over.2 Since enabling the new ads (which only show on single posts) several days ago, I've seen exactly zero clicks. It'll be an interesting experiment to see if that changes.
- Earnings below the payment threshold of US$10 are forfeit in transfers. Not that I ever used my old account, so it couldn't possibly have any earnings. [↩]
- It's not like anyone has really seen my changes. I haven't gotten around to officially forking the code and releasing my own version under a different name. Doing that sounds like a summer project, maybe, depending on how busy I am, as it will involve updating the theme code to meet all of the current Theme Review guidelines. [↩]
Why I Will Not Use Seesmic, Ever
Update (03/03): This post garnered a response from a Seesmic employee, Yama, in the comments. From "figure out the best pricing model", I gather that pricing remains undecided, so I maintain my hope for a HootSuite-like freemium model. I'm also glad to hear that the green bar will be reviewed for possible improvements. Thank you, Yama; if I have more thoughts I will certainly email you.
Earlier this month, no doubt on or soon after February 6, 2012, I went to Ping.fm to find a green bar on top of the area where I usually clicked to log in and get on with posting things to my social networks. Seesmic, apparently, had other plans. They really wanted to make sure I heard about their new product, Seesmic Ping. They covered the login link with a green bar to make sure I'd notice it.
All right, fine, I went to have a look. I didn't feel like signing up for the new service, though. Instead, I dug up the blog post announcing Seesmic Ping, from February 6. Near the end, there was a very telling paragraph:1
For Ping.fm users – With the release of Seesmic Ping, we’ll look to maintain Ping.fm for some time. In the meantime, we encourage you to sign up for a Seesmic Profile and give Seesmic Ping a ride through our mobile applications or the web.
I wasn't the only one made uneasy by those two sentences. "for some time" really doesn't mean "indefinitely", and sure sounds like Seesmic will eventually kill Ping.fm entirely.

Source images: Question mark, Axe, logo from Ping.fm website
I've had complaints about Ping.fm over the years, occasionally about performance. But most of them came from decisions made by Seesmic, explicitly or not, after they acquired Ping.fm. They were things like:
- No new API keys for applications
- Disabling API keys for applications like the Shorten2Ping WordPress plugin, instead of blocking the users who were spamming
- No new services for years
- Issues with existing services, like Jaiku (which Google shut down completely about a month ago)2
- Broken post-by-email3
Despite all the issues following the Seesmic acquisition, Ping.fm has remained solidly usable. But Seesmic has now announced a successor to Ping.fm — and what's more, they intend to charge for it (emphasis mine):4
We’ll look to have more features and services when Seesmic Ping comes out of beta as a paid service.
No pricing came with the announcement, just a notice that the new service would eventually cost money. I know we've all been spoiled by free Web services, and the money has to come from somewhere, but somehow I have my doubts that Seesmic will take an approach that is consumer-friendly. HootSuite has a great pricing model: Features that consumers will use (a few profiles, with one user who can manage them) are free; business-level features (more profiles, multiple-user collaboration) cost money. I don't think Seesmic Ping will follow that structure; if I had to guess, everyone will have to pay for it.
I mean, really, Seesmic could have made the green bar push the entire page down, instead of floating it over the four tabs at the top. Look at what it covers:
It floated on top of the page for a reason, I'm sure. Putting it there made me click on it to make it go away (it didn't). Then I read it, and followed the link. No doubt I followed the expected sequence of actions precisely. And that irritates me, because the green bar should have just looked like this:
I imagine that the reasoning went something like, "If it doesn't cover the login link, users will ignore it. No, displacing the login link by 40 pixels isn't enough; it has to actually be inaccessible. We will force users to read this bar on every single page." Oh yeah, it pops up on every single page view. Home, login, Dashboard, settings, you-name-it — green bar ALL the pages... for lack of a better X all the Y idea.
There was also an email newsletter sent out on February 15, announcing Seesmic Ping, which I read after going through the whole "green bar" thing. It too addressed the future of Ping.fm... sort of:5
Like many of you, we appreciate the passion that Ping.fm brings, and made sure to carry over its core value of the simplicity in posting. With the launch of Seesmic Ping, we continue to enhance this service with reliability and robustness, while offering key features such as scheduling and the ability to post to multiple Twitter accounts and Facebook pages.
Eventually, Seesmic Ping will be a paid service. While in beta, Seesmic Ping is free to access. If you have any feedback, please tell us what you think: feedback.seesmic.com.
The email announcement carefully avoided any mention of shutting down Ping.fm. The original blog post never changed, though, so the plans are certainly still in place.
This state of affairs is really disappointing, because I've used Ping.fm as a staple of my online life for, literally, years. According to TweetStats, I've posted from Ping.fm more than I have from Twitter.com. (twhirl is still on top because I used to have it open all the time back in high school.) I post from the Web, from a third-party app on my Android phone, via SMS, and I used to use email posting from my mother's cell phone back before I had my own. In short, I use Ping.fm a lot. It still is the best option I've found on the market for cross-posting to different social networks.
If (or rather, when) Ping.fm goes away, I'll probably end up switching to Hellotxt. Hellotxt has its own share of issues at the moment, including a lot of disabled services and a noticeably slow site, but it's still the best alternative to Ping.fm. I can also just roll up my sleeves and build my own personal system, since all of the sites I use provide free API access, but I'd rather not take the time to do that. It would also load my (very) shared server and lack a lot of features like posting via SMS6 and scheduled posting.7 Could I implement them? Sure. Would I take the time? Questionable. Additional features also mean additional server load, and so on.
The point is, I have only one practical alternative, Hellotxt, because building my own is hard, time-consuming, and unlikely to happen any time soon. I dream that Seesmic will change plans and decide not to kill Ping.fm, but the reality is that it's almost certain to happen and the only question is when. Hopefully Hellotxt will have its issues worked out by then and will be ready to take over as king of the cross-posting niche. It would certainly serve Seesmic right if Seesmic Ping never went anywhere, and that might be worth losing Ping.fm.
As for never using Seesmic, ever, well, let's just say I oppose the way they do things. I don't like it when a company buys another company, takes the ideas and technology from existing products, and then shuts down the old company's services. Google does that a lot, and those are the times when I come closest to hating Google. The difference is, Google almost always creates awesome things out of the remains of old companies and services. Seesmic hasn't really done anything but allow a useful product to stagnate, and now they're going to kill it at some unspecified future date, replacing it with something that can never be a true replacement. You can't replace a free service with a paid service; it doesn't work that way.
If Seesmic takes their pricing structure in the same direction as HootSuite, though, and they only charge for certain features, I might actually give Seesmic Ping a try. I have a hard time imagining a situation that would make me actually like Seesmic as a company, though.
- The paragraph was riddled with links to Seesmic.com, which I didn't copy. There was no point. [↩]
- Unlike other social networks that died, Jaiku had a dedicated following willing to preserve its contents, if not the functionality. Apparently, my "presences" are archived. [↩]
- Added later on publish date (23:20 or so) when I discovered that Shorten2Ping had failed to post this article via Ping.fm. My server's emails are working. The problem is with Ping.fm. Grr. [↩]
- Yes, I skipped copying another link to Seesmic.com. All occurrences of "Seesmic Ping" were linked except for one. I guess somebody missed it. [↩]
- And just like in the blog post, every occurrence of the phrase "Seesmic Ping" was linked to Seesmic.com. Talk about carpet-bombing links. [↩]
- If I'm not paying for Seesmic Ping, I'm certainly not shelling out for an SMS gateway to serve my one-user app. [↩]
- Ping.fm only has scheduled posting because HootSuite supports Ping.fm. It's not native. Hellotxt has native scheduling, but I haven't tested it yet. [↩]
Leaky Websites
This is my second blog post assignment for my Journalism course. As with the first, reposted here because "why not".
The New York Times' "Bits" blog published an article last Tuesday that really opened my eyes. The Center for Internet and Society at Stanford Law School released data on what information is passed between certain popular websites.
Long story short, logging in (or even trying and failing to log in) to a site can pass information about you to third parties. That information can be as innocuous (but still trackable) as a "unique identifier" generated by the site or as specific as your email address, username, and real name.
Somini Sengupta (author of the Bits blog post) says:
Take for instance these findings, released on Tuesday by computer scientists at Stanford University. If you type a wrong password into the Web site of The Wall Street Journal, it turns out that your e-mail address quietly slips out to seven unrelated Web sites. Sign on to NBC and, likewise, seven other companies can capture your e-mail address. Click on an ad on HomeDepot.com and your first name and user ID are instantly revealed to 13 other companies.
I did some digging through the Microsoft® Excel® spreadsheet available from the Stanford Law School page (direct link to XLSX file) and found some interesting examples of my own.
For example, MSN.com leaks your birth year and birthdate to FBCDN.net (a domain owned by Facebook and used for content distribution). Facebook's CDN can't possibly need that information for anything but tracking. Take another case: Ask.com sends your username to Google Analytics, reCAPTCHA (owned by Google), ScorecardResearch (part of comScore, Inc.), Gigya (a company that "makes websites social"), Quantserve.com (used by Quantcast, an advertising network), IMRWorldwide.com (controlled by Nielsen), and LinkedIn.
Incredibly, The Huffington Post's website sends your username to BlogCDN.com (another CDN), BuzzFeed ("Tracks the Web's Obsessions in Real Time"), AdSonar (owned by Advertising.com; provides targeted text ads), ScorecardResearch, AOL.com (Huffington Post's owner), FBCDN.net, aolcdn.com (AOL's CDN), ATWOLA.com (stands for AOL Time Warner Online Advertising; tracks surfing habits), Facebook.com and Facebook.net, Google Analytics, IMRWorldwide.com, Quantserve.com, and HuffPost.com (used for delivering static content without cookies, ironically); your birthday to BuzzFeed and IMRWorldwide.com; and your birth year to Advertising.com and ATWOLA.com.
The point is, any information given to a website as part of the registration process or entered later while updating a profile allows third parties to do just that: profile you as a person through your behavior across countless sites. All this tracking is thanks to the triviality of circumventing the "same origin policy" of data stored in browser cookies through collaboration between sites.
A standard feature of Web browsers is sending the address of the last page visited (the "referrer") to the page being loaded. In the case of images, scripts, or other resources loaded within a page, the referrer is the page in which they are embedded. If the page displaying advertising has personal information embedded in its URL, that information is passed on to any sites whose assets are embedded in the page. This kind of information leakage can be accidental as well as deliberate. Encryption (URLs beginning with https://) limits it only partly: browsers typically withhold the referrer when a secure page requests a resource over an unencrypted connection, but a secure page still sends it to resources that are themselves loaded securely.
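To make the referrer mechanism concrete, here is a minimal sketch in Python. It is not taken from the Stanford study; the URLs, the parameter names (`email`, `user`), and the function name `referrer_leaks` are all made up for illustration. It simply lists which third-party hosts would receive a page's URL as the referrer, and what a tracker could read out of it.

```python
from urllib.parse import urlsplit, parse_qs


def referrer_leaks(page_url, embedded_urls):
    """Map each third-party host to the query parameters it could read
    from the referrer sent along with its embedded resource."""
    page = urlsplit(page_url)
    visible = parse_qs(page.query)  # everything in the page URL's query string
    leaks = {}
    for resource in embedded_urls:
        host = urlsplit(resource).hostname
        if host is None or host == page.hostname:
            continue  # relative or same-site resource, so not a third party
        leaks[host] = visible
    return leaks


# Hypothetical page whose URL carries personal information
page = "http://news.example.com/welcome?email=jane%40example.com&user=jane42"
# Hypothetical resources embedded in that page
assets = [
    "http://news.example.com/logo.png",
    "http://ads.example.net/banner.js",
    "http://stats.example.org/pixel.gif",
]

for host, params in referrer_leaks(page, assets).items():
    print(host, params)
```

The sketch ignores encryption and any browser privacy settings; it only shows why a site that puts an email address or username in its URLs hands that data to every embedded third party.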
Websites that want to share user information intentionally might go about it another way. I had written out an example process, but it is sufficient to say that methods for intentionally sharing information and tracking users across domains, even in spite of user privacy choices like clearing cookies, are numerous.
When information is revealed in the URL, it's not necessarily intentional. Back in May, Symantec discovered (The Daily Mail reports) that some applications on Facebook's platform were potentially giving advertisers access to users' accounts due to app URLs including access tokens, the bits of information older Facebook apps used to identify themselves and connect to users' accounts. It was just an oversight.
Google Books and the Book Industry
I wrote this for my Journalism class at college, but figured I might as well share it here too.
The New York Times ran a story Monday about a new lawsuit filed against HathiTrust, a partnership of universities and research libraries that maintains a digital book collection on its website.
Plaintiffs in the suit include three major authors' groups: the Authors Guild, the Australian Society of Authors, and the Québec Union of Writers. Eight individual authors are also party to the filing, among them Pat Cummings, Roxana Robinson, and T.J. Stiles.
The objections raised in the suit center around the HathiTrust collection itself. "[S]even million copyright-protected books" (according to Paul Aiken, executive director of the Authors Guild, as quoted by the NYT) are available without any consent from the authors. The Authors Guild and its fellow plaintiffs say that the collection violates copyright law.
HathiTrust's collection consists of books digitized by Google, Inc. as part of the Google Books project, which has been steadily scanning books from participating university libraries across the United States.
The Google Books project has been the subject of many lawsuits in the years since work on it began in 2002. A few examples will help provide context:
- 2005: The Authors Guild sues Google for "plain and brazen violation of copyright law" (archived press release from AG via Archive.org)
- 2009: French court halts Google Books in France: the ruling applies only to books published in France under copyright (Los Angeles Times article)
- 2010: Several professional photographers' organizations bring a class-action suit regarding the reproduction of copyrighted images within the books scanned by Google (Mashable.com article)
The Authors Guild has been involved with this issue before. This time, the fight has been brought to an organization with a bit less might than Google.
But never mind who sued whom, for what, and when. The issue is really quite simple, and most of the lawsuits against Google Books have had little to no merit.
United States copyright law (the laws under which most Google Books lawsuits have been filed) contains a doctrine known as Fair Use. It was originally intended to protect commentary, critique, and parody of copyrighted works. However, the principles of Fair Use reach further than that, weighing four factors (Cornell University Law School Legal Information Institute):
- "the purpose and character of the use" — e.g. for commentary, critique, parody, scholarship, etc.
- "the nature of the copyrighted work" — published/unpublished, fact/fiction
- "the amount and substantiality of the portion used" — how much of the work was used, and how significant the used portion is to the work as a whole
- "the effect of the use upon the potential market" — if the use of that portion will negatively affect demand for or the value of the original work
(Thanks to Stanford University's Copyright & Fair Use information center for helping me refresh my own memory of these concepts.)
The way Google Books works is carefully designed to fit within existing copyright laws. Books in the public domain are fully accessible, with no restrictions. Copyrighted, in-print books allow whatever access the publisher has specified. For in-copyright books that do not have a publisher, Google restricts access to "snippets", which show just a few words surrounding the user's search term.
So: Whenever Google Books shows a significant portion of a book, it has permission from the publisher to do so. Without permission, Google Books displays tiny fractions of the full work in an immensely transformative manner.
Google Books falls well within Fair Use doctrine, at the very least. Displaying card catalog–type information about the book plus at most a sentence or so for each search result (I'll go down the Fair Use list):
- Is for scholarly reasons
- Uses published works
- Displays at most a few percent of the whole book
- May actually increase demand for the books featured in the results
(Parts of Lawrence Lessig's 2006 video discussion of Google Book Search came in handy for an overview of how Google Books works.)
So why are publishers and authors suing Google and HathiTrust?
As far as I can tell,[original research?] HathiTrust follows the same rules as Google Books. This makes sense, as the content is from the Google Books program.
HathiTrust's entire archive is intended for academic use. It's unclear why the various plaintiffs in this new lawsuit are suing for the removal of their books from the archive, rather than suing for better access controls. If the concern is that anyone can access the books (which they can), then restricting access to verified researchers would clear up the problem.
It's like big music, film, and television. The music industry figured out that it could simply adapt to the Internet and start offering content over the new medium, giving people an alternative to pirated copies shared through services like Napster, LimeWire, and BitTorrent. Film and television haven't yet figured that out, and I guess the book industry is still working on it too.
Finally: Google Voice Export Feature Released (sort of)
It took quite a while (more than two years since launching in March 2009), but Google Voice finally supports exporting!
I'd love to think my export format ideas post had something to do with the end product released yesterday, but I seriously doubt it.
Sort of...
Let's just say, Google Takeout isn't behaving very well. The test archive I created yesterday won't download, and I've tried both Google Chrome 13 and Mozilla Firefox 3.6. The feature isn't there yet, but I'm sure Google engineers are working on it.
I'm still happy...as soon as they make it actually work.
Polishing Minneapolis’ Wireless Civic Garden
I've done some playing around with the citywide Wi-Fi here in Minneapolis, and I must say that the range of information accessible through the Civic Garden feature (which allows even non-subscribers access to City-related sites) is impressive.
However, while I understand that the whitelist of "free" domains is limited to noncommercial properties, there are a few exceptions that should be made. Or at least, some resources should be hosted by the City or proxied for Civic Garden users.
Metro Transit's site
Visiting MetroTransit.org when online via the Civic Garden is a little weird. The home page is a lot longer than usual — actually, most pages are longer than usual — due to the absence of JavaScript libraries hosted at ajax.googleapis.com.
Because of the missing code, features that normally hide away in compact accordion stacks or appear when the mouse is moved over them are left in the open. One of them even steals focus when the page has loaded, making the view jump most of the way down the page. It took me a while to figure out why the page was scrolling by itself.
The navigation is broken for all but the top-level sections, because the missing code runs the drop-down menus that allow deeper browsing into the site. On the front page, a series of five images depicting the various Metro Transit services1 that is normally an automatic slideshow with mouse interaction expands to five panes stacked down the page — and the links embedded in them don't work.
On the right side of the page, a clutter of tools appears where there is normally a neat stack of expandable options. One of them is the culprit for the page-scrolling I observed, and it gets annoying after a few pageviews to have to scroll to the top of each new page loaded. (Somewhere in the page's code, a JavaScript snippet that doesn't rely on one of the missing libraries is placing the caret2 in a text input near the bottom of the page, and most browsers automatically scroll to make such a "focused" element visible.)
Glancing at the page's source, I notice immediately which files must be the problem. Two <script> tags include the jQuery and jQuery UI libraries from Google's CDN. This practice usually improves speed, since the likelihood of the files already being cached by a visitor's browser is increasing as more and more sites start using these Google-hosted versions of popular JavaScript libraries instead of their own copies — but in this case, it's causing breakage for a subset of users. Google's service is not whitelisted as part of the Civic Garden.
Solutions
Two solutions present themselves, and they are both simple.
Ideally, Metro Transit would pass a request up the chain for ajax.googleapis.com to be whitelisted. Not only would doing so solve the problem for their site, but it would also allow any other Civic Garden website to take advantage of Google-hosted libraries without any further work from either Civic Garden administrators or individual site maintainers.
This first solution also has the potential to save bandwidth, since Google sends aggressive caching instructions along with the files hosted on its CDN. More Civic Garden sites using libraries hosted by Google would result in negligible increases in data transfer, because the same files would be downloaded once and then cached for use by any site requesting them. Saving bandwidth on the free Civic Garden would open up more of the pipe for paying subscribers, an outcome with which U.S. Internet would no doubt be pleased.
Alternatively, Metro Transit could add the core jQuery and jQuery UI files to the pre-existing /ClientScript/ directory, which I can see already contains plugins to those libraries, the Cufón library,3 and a font file for Cufón to use, among other things.
This alternate solution is a good fallback if the higher powers in control of the whitelist refuse a request to allow access to ajax.googleapis.com. It only solves the problem for Metro Transit's website, but it would fix the issues discussed above.
A third, much more complicated, option is described below. Obviously, if it were applied to the Metro Transit problem, ajax.googleapis.com would be used where www.google.com is in those examples. While it would also work, it is unnecessarily complicated for the scope of the problem facing Metro Transit's website, and that is why I don't count it as a solution here.
Resolution
Some time ago I contacted Metro Transit using the feedback form on their site to notify them about the breakage and propose (in brief) my solutions. I received a response just a few days ago, with the welcome news that they will be fixing the problem in the next site update by hosting the JavaScript files themselves. Not the ideal solution, but definitely the easier of the two possibilities I could think of.
Way to go, Metro Transit! You've beaten me to the punch. Not that it's hard to do these days, what with my posting frequency and all...
The City's site
Located at www.ci.minneapolis.mn.us, the Official Website of the City of Minneapolis has a wealth of information on everything from regulations to recycling and more. It allows access to City Council agendas, a list of what can and cannot be left for the recycling program, and countless other unexciting but eminently useful bits of information.
The main problem with the City's site as viewed through the Civic Garden access is that it is impossible to search. Submitting a query through the search box at the top of any page leads the user to a page that says "Search the Minneapolis Web Site" above another (empty) search box. And pretty much ends there.
It's great to see that the City (or at least its Web developers) is embracing modern Web services like Google's Custom Search Engine, but all the resources required to fetch and display results come from www.google.com, a domain blocked when using Civic Garden access.
Not an Easy Problem
Solving this problem is a bit more difficult. Whitelisting www.google.com is out of the question, as that would also allow free access to many of Google's consumer-oriented services including its trademark search engine, calendar, feed reader, and so on.4 Unfortunately there are no easy solutions here. Hosting the JavaScript files doesn't solve the problem because those files in turn load other files whose locations are embedded in the code.
Implementing some sort of proxy would seem to be a solution, but there's still the matter of hard-coded resource locations. Nothing returned by Google would request files via a City-controlled proxy, no matter how sophisticated the proxy.
There's also the matter of load. Obviously any solution involving the use of City hosting services should be restricted to those users who need it — that is, Civic Garden users — to avoid unnecessary load on the servers. But there might not be a way of separating the "needs" from the rest of the crowd in a way that would allow the server to send different pages to those who need them.
Best Idea Forward
Without knowing more about the network architecture, I can come up with only one possible solution.
The flow would go something like this:
- User loads search page, and browser requests resources from Google
- U.S. Internet network5 receives and recognizes requests destined for www.google.com
- Network scans a list of allowed request patterns to www.google.com; such a list allows only the resources needed for Google Custom Search
- User's browser receives the needed resources
- Google's Custom Search code sends its requests to retrieve results, which are filtered through the same mechanism at the network level and allowed to return data to the user, completing the search
It's a rough description, but generally all that's needed is an extension of the domain-based filtering to enable filtering on request patterns — that is, the contents of the GET line in the request headers.
If the requested hostname matches www.google.com, that request is sent to a second filtering routine that performs pattern analysis (via regular expressions or what-have-you) on the requested path. /jsapi and /coop/cse/* can get through and return those resources to the user; /reader/view/ and /webhp?q=denied can't, and redirect to the subscription login page (the current behavior for all non–Civic Garden sites).
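As a rough illustration of that second filtering routine, here is a sketch in Python. I have no idea what the real network equipment runs, so treat every name here (`WHITELISTED_HOSTS`, `ALLOWED_PATHS`, `decide`) as hypothetical, and the two path patterns as nothing more than the examples given above rather than a complete list of what Google Custom Search requests.

```python
import re

# Hosts that are whitelisted in full (illustrative entries only)
WHITELISTED_HOSTS = {"www.ci.minneapolis.mn.us", "metrotransit.org"}

# Hosts that are blocked in general but allowed for specific request paths.
# The patterns mirror the examples in the text and are assumptions.
ALLOWED_PATHS = {
    "www.google.com": [
        re.compile(r"^/jsapi(\?.*)?$"),  # the loader script, with or without a query
        re.compile(r"^/coop/cse/"),      # anything under /coop/cse/
    ],
}


def decide(host, path):
    """Return "allow" or "redirect-to-login" for one request."""
    host = host.lower()
    if host in WHITELISTED_HOSTS:
        return "allow"  # first stage: the existing domain-based filter
    patterns = ALLOWED_PATHS.get(host, [])
    if any(pattern.search(path) for pattern in patterns):
        return "allow"  # second stage: the path matches an allowed pattern
    return "redirect-to-login"  # current behavior for everything else


for requested in ["/jsapi", "/coop/cse/brand", "/reader/view/", "/webhp?q=denied"]:
    print(requested, decide("www.google.com", requested))
```

Keeping the per-host patterns in a plain table means that opening up another third-party service would be a data change, not a code change.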
Implementing this solution would require analysis of all the possible requests generated by Google Custom Search, though Google might have available (or be willing to provide) a reference of how Custom Search works. Once put in place, the filtering expansion would enable any site in the Civic Garden to use the service and have it work for everyone, without changing anything else. It might also require changes to the network equipment that runs the citywide wireless service, but such upgrades would prove useful in short order as more City services were made available to Civic Garden users thanks to the accessibility of search. (See the next section.)
Other Applications
While the main problem with the City's site as accessed via the Civic Garden is the lack of search, there are other issues.
Forms, for instance, seem to mostly be hosted on external sites that are not included in the Garden whitelist. Much information is given about the services these forms can be used to obtain (such as snow emergency notifications by telephone or email), but filling out the forms is impossible.
A complete audit of all external resources called by the City's site (and in general, all Civic Garden sites) could provide a list of domain names and resource paths for whitelisting. The above-described filtering system could be extended with the contents of such a list so specific pages from commercial sites used on City properties could be made available, while still blocking effectively all commercial traffic from the Civic Garden.
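A first pass at such an audit could be automated. The sketch below, in Python, scans one page's HTML for scripts, images, frames, stylesheets, and form targets, and reports every host and path that is not already on the whitelist. The class and function names, the sample markup, and the example.gov and example.com hosts are all invented for illustration; a real audit would also need to follow resources loaded by scripts at run time, which this does not attempt.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class ResourceCollector(HTMLParser):
    """Collect (host, path) pairs for resources referenced by a page."""

    # Tag name mapped to the attribute that holds the resource location
    ATTRS = {"script": "src", "img": "src", "iframe": "src",
             "link": "href", "form": "action"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = set()

    def handle_starttag(self, tag, attrs):
        wanted = self.ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative locations against the page's own address
                parts = urlsplit(urljoin(self.base_url, value))
                if parts.hostname:
                    self.found.add((parts.hostname, parts.path))


def external_resources(base_url, html, whitelist):
    """Return sorted (host, path) pairs whose host is not whitelisted."""
    collector = ResourceCollector(base_url)
    collector.feed(html)
    return sorted((h, p) for h, p in collector.found if h not in whitelist)


# Made-up page; the jQuery version in the path is illustrative
sample = """
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.6/jquery.min.js"></script>
<script src="/ClientScript/plugins.js"></script>
<form action="http://forms.example.com/signup"></form>
"""

for host, path in external_resources("http://www.example.gov/page", sample,
                                     {"www.example.gov"}):
    print(host, path)
```

Running something like this across every Civic Garden page would yield the list of domain names and resource paths to feed into the filtering rules.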
Enabling access to third-party resources that are currently blocked, despite being included in Civic Garden properties, would provide an even greater return on the investments of time and (possibly) money in the upgrades of network hardware and firmware that would likely be necessary to support such a filtering system.
I emailed the City about this and was notified several days later that my message had been forwarded to their IT department. At that time I hadn't come up with this new filtering idea, so I've contacted them again with a link to this post. Maybe they'll read it, maybe not; but it's been a nice thought experiment.
- Which are: Bus, light rail, Northstar commuter train, bicycle accommodations, and Rideshare (car or van pools). [↩]
- caret: the blinking line or box often used to enter text on a computer [↩]
- Cufón replaces specified text elements with graphics rendered dynamically by the browser to provide more control over typography than the current lowest-common-denominator browser-native technologies. [↩]
- I for one would love it if the City had Google services whitelisted so I could check my email and calendar from pretty much anywhere for free, but I can understand the need to block commercial sites on a publicly funded network. [↩]
- U.S. Internet is the local ISP that was awarded the contract to build and run the citywide wireless service. [↩]
Reflection Squared: On Clifford Stoll’s “High Tech Heretic”
The other day, I was browsing the computer shelves at a local Borders bookstore. I came across Cliff Stoll's acclaimed book, The Cuckoo's Egg. My dad's recommended the story to me in the past, and the premise was intriguing. After all, who wouldn't want to read a non-fiction account of cyber espionage that reads like a top fiction mystery? I picked up the book and proceeded to spend the next two hours engrossed, reading right through the soft muttering and louder tapping of the woman in the chair beside me.
Of course, the time to depart arrived and I had to stop. Still, I read about 25% of the book in one sitting. I replaced the book on the shelf, noting to look for it at the library and/or add it to my wish list. (Even if I wanted to buy it, I wasn't exactly in a position to do so.)
The next day, en route to the upstairs computer lab, I checked the public library catalog. The Cuckoo's Egg wasn't in stock, and was checked out until the 21st of April, but I noticed that one of Stoll's other books was: High Tech Heretic: Why Computers Don't Belong in the Classroom and Other Reflections by a Computer Contrarian. On impulse, I checked the book out.
What I found inside, later, was intriguing. My parents have been skeptical of computers for a while. Though my dad uses them for his business, and my mom is warming up to them after years of asking me why I find them so interesting,1 there's still a big disconnect between us.2 I've vaguely known the reasoning behind their conclusions for years, but High Tech Heretic has shed some light on the details — and not monitor glow.
Programmed Instruction
Despite my parents' computer skepticism, I took my entire high school education online. I believe it was a good experience, though not for the reasons one might expect. It's not that I necessarily learned more than I would have in a conventional school — though I probably did, since the online coursework better fit my learning style — but rather that I spent a good chunk of my "school" time correcting the course material. Lazy QA teams had left the text, quizzes, and tests riddled with little errors. Through my teachers, I sent corrections, and my correction work earned back more than a few points that were wrongfully denied me in nearly every course — though I never got so much as a "Thank you" from the course distributors. (A rare few courses were bereft of glitches. I treasured them, because I didn't have to keep second-guessing everything.)
What was interesting about some of the corrections, though, was that sometimes it was just a matter of input formats. Most of the graded tests were multiple-choice, but many of the in-text "Self-Check" quizzes featured free-text inputs. Such quizzes were graded by JavaScript code, to give students an idea of how well they understood the material. But some of them had vague or quirky requirements about how answers were entered, and some of the quirky expectations made by the programmers resulted in points lost by students.
Stoll addresses the issue on page 16, in reference to B. F. Skinner's experiments with programmed instruction in the 1950s. Skinner's approach was nothing new, really — it mimicked a popular learning method preached by many educators then and now: repeat a topic until the student demonstrates understanding. Skinner's machines rewarded students for correct answers with further exploration of the topic, while incorrect answers led to review.3 However:
…programmed instruction flopped. The machine forced kids to regurgitate whatever answers the programmer wanted. There was no place for innovation, creativity, whimsy, or improvisation.
This sounds very familiar. Almost too familiar. The quizzes in my online coursework sometimes had bizarre expectations for what was to be typed into the text boxes. I once had a quiz (thankfully not graded) that balked at accepting a floating-point number (0.17 or something) with the leading zero; the expected input was .17 and too bad if you've been trained to put in the leading zero. The programmers were treating all text box inputs as strings, rather than parsing the values into numbers when appropriate. We all know that programmers are lazy, but certain kinds of laziness are inexcusable.
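For what it's worth, the fix is not hard. Here is a minimal sketch in Python (the real quizzes were graded in JavaScript, and I never saw their source, so the function name and behavior are my own assumptions): try to compare the answers as numbers first, and fall back to a text comparison only when they aren't numeric.

```python
from decimal import Decimal, InvalidOperation


def answers_match(expected: str, submitted: str) -> bool:
    """Compare a submitted answer to the expected one.

    Numeric answers are compared by value, so ".17", "0.17", and "0.170"
    are all treated as the same answer. Anything that does not parse as a
    number falls back to a case-insensitive text comparison.
    """
    expected = expected.strip()
    submitted = submitted.strip()
    try:
        return Decimal(submitted) == Decimal(expected)
    except InvalidOperation:
        # At least one side is not a number: compare as text instead.
        return submitted.lower() == expected.lower()
```

With this approach, `answers_match(".17", "0.17")` is true, and the student who was trained to write the leading zero keeps the point.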
Skinner's ideas persisted, even into the years of my childhood. I had plenty of educational computer games in my youth, and maybe they did help teach me. Very little of what I know comes from conventional schooling — I know that much. Reading, writing, arithmetic, higher math, typing, (amateur) programming — all of it I learned outside the classroom. Reader Rabbit, Treasure Math Storm, and Edmark's Mighty Math software deserve more credit for my education than any school classroom I ever set foot in. Forgive me if it sounds like bragging, but I could read and write circles around most of my traditionally-educated friends all through my schooling. Kumon and my learning-friendly home environment can take the credit for my perfect score on the ACT's English section, not the school system.
Stoll also brings up computers in the classroom repeatedly. One great example is the replacement of science labs with computer programs. My local high school has a chemistry/physics lab, but an unscientific sample of the classes taught in the room shows much greater use of the computers for experimentation, rather than the lab equipment.
Learning the Tools, Not the Trades
Stoll also brings up the issue of learning how to use specific tools rather than the concepts underlying them. Chiefly discussed in the chapter "Calculating Against Calculators", the arguments focus on numerical fields; however, the thread is present practically from the beginning and applied to all subjects.
Through school, students are handed calculators in math class. They're trained to punch in the numbers and trust the calculator to come up with the right answer. Now, common sense dictates that one should always be able to estimate, so as to be able to catch errors in a calculation. In theory, students are taught to mentally check the calculator's results; in practice, assignments are turned in with answers stating that a radio tower is a fraction of a millimeter tall.
On page 85, the University of Illinois is used as an example. The school developed a calculus course centered on the Mathematica software. As such, the students learned how to integrate functions using Mathematica, rather than learning how to integrate. Students trained to use certain software programs for problem-solving often didn't know what to do when the electronic part of the equation (sorry) was removed.
In my math classes, I can remember very few times when I wasn't encouraged to use a calculator. A TI graphing calculator was a requirement for high school math classes, but I got through four years of online instruction with a photoelectrically-powered scientific calculator, used mostly for checking myself and dealing with nasty decimals. (I was fine graphing linear equations on graph paper, but I did cave in and download a software program to do the parabolic and asymptotic functions for me.)
Learning tools at the expense of the underlying concepts isn't just limited to math. From my own experience, as well as friends', I've seen courses teach how to use a particular software program to solve a problem, without explaining what the program does. Modern English course requirements for electronically-submitted papers just beg for students to rely on spell-checking software. Many of my fellow students routinely misspelled even the most common and simple words. I can't help but blame Microsoft Word; it's the de facto standard for word processing these days, and defaults to automatically correcting a huge list of common misspellings so sometimes the user doesn't even know he's made a mistake. That's a bad idea for software used in education.
Systems Design Philosophy
Perhaps one of the best points made in the book is taken from David Gelernter's thesis: "Technology's most important obligation is to get out of the way." This point, from page 139, illustrates the basic purpose of machinery: making life easier. Bad design and useless features remove the helpful aspect of technology and replace it with nuisance.
Ah, PowerPoint
Following chapters on, among other things, the wiring of libraries and the planned obsolescence of computer systems, an entire chapter is devoted to PowerPoint and its fellow presentation software products. I thought the best part of this chapter was the section discussing the use of presentations in schools.
With my online learning experience, I was thankfully spared most of the PowerPoint junk that has made its way into the school curriculum. However, I had teachers in the offline world as well, and a few of them used PowerPoint to disastrous effect.
One such teacher followed the model for meetings presented earlier in the chapter: Notes for the students, slides on the screen; the lectures consisted of reading the slides aloud, with zero additional information presented in the spoken words. I was always bored to tears in that class. It was ironic that the course title was "Public Speaking", since such a class should be teaching students how to keep an audience's attention instead of how to make the audience yawn.
Another teacher — this was in a public school — taught her AP U.S. Government course using PowerPoint. She read from the slides, often rushing through and/or skipping slides for time (no worries, the slides were available on her personal Web page for study at home). Her habit of putting paragraphs on the slides wasn't exactly prime PowerPoint use, but at least she added extra tidbits to her lectures that weren't in the textbook or on the screen.
I should also note that part of that Government class was a group presentation project, on which I got a good grade just by going up and reading a few of the several slides produced by my group while I was sick. That isn't a complaint — I like good grades just as much as the next guy — but I didn't really have any input whatsoever on the project save for a few grammatical corrections. (I won't get into how my classmates made it difficult for me to contribute, even though I was perfectly willing to do my share.4)
I present these examples mainly to illustrate my own personal experience with the problems Cliff mentions on pages 182–183. (It's interesting that his main classroom example also involves a social studies teacher.) I'm sure educators would be quick to defend the growing use of PowerPoint in schools by citing technological familiarity for future job use, same as they would for school Internet connections (which are useful, but often inadequately restricted).
Dated Material?
I did have the thought throughout the book, however, that perhaps some of Stoll's opinions would be quite different if written today. In particular, page 189's assertion that professional editors and journalists just don't exist on the Internet is no longer true. That assertion is a fundamental point in several arguments following — arguments that would probably be different (if only slightly) if written from a 2010 perspective instead of a 1999 perspective.
Similarly, page 191 asserts that search engines don't understand concepts and ideas, only words. Today's indexing engines aren't perfect, but great strides have been made in machine understanding of language. Just look at services like Aardvark. (This is, of course, just a tiny subset of the possible examples I could have pulled from the book.)
Of course some things — unfortunately — never seem to change. I stupidly didn't note the location of it, but somewhere in the latter part of the book Stoll laments that search engines rely on correct spelling to find information. Spelling is a skill seldom taught or learned in today's world (it seems), and we rely more than ever on spell-checkers. Many services offer their own (see Gmail & Google Docs as examples) in the event that the user's browser doesn't have one already built in. Search engines have been trained to recognize our mistakes in queries (à la Google's classic "Did you mean?" lines) and sometimes I think they also detect mistakes in pages they index.
Overall
High Tech Heretic contains a good many well-placed warnings, and I very much appreciate Stoll's opinions on the replacement of human and paper resources with technology. However, I hope that his later writings are better edited. This book has quite good spelling (good, since he brought up that issue) but the grammar is lacking in a few spots; I found a decent number of omitted or misplaced words.
Nitpicking aside, the message of the book is clear and appreciated. Technology has a place, and we shouldn't let it get out of the corner we've set aside for it.
Update (05/04): Corrected missing markup that caused most of the text to appear as a giant footnote. Proofreading failure on my part; sorry!
- She's begun asking me about websites and such: Hosting recommendations, platform suggestions, that sort of thing. It's kind of cool that she's interested now. [↩]
- I used to go to my dad with questions about the computer. Now, he comes to me with his questions and I use search engines to find answers for my own. [↩]
- I had several experiences with this type of learning, including both online (with Stanford's EPGY program) and off (with Kumon, a Japanese-originated curriculum in math and reading). [↩]
- Schools seem to use group projects a lot without teaching students how to collaborate, kind of like a lot of theatre classes tell the actors to project without getting into the mechanics of doing so. [↩]
tr.im: An Exercise in How Not to Run a Service
It recently came to my attention that tr.im has stopped accepting new URLs through its website, has asked developers to remove tr.im functionality from their applications, and plans to shut down the redirection service in a year or two. I went there to shorten an address on Tuesday but came upon this page instead:
Ever since discovering the service about two years ago, I have shortened almost every URL I post to Twitter, Facebook, and several other such sites through tr.im. That will have to stop, apparently, because those addresses will no longer work in the not-too-distant future. It is unfortunate that nothing can be done about the millions of tr.im links that have already been flung to all corners of the Web.
Apparently, the August 2009 announcement/scare (see Mashable's coverage) should have been taken more seriously—a lot more seriously. Following that little episode, the overwhelming response from users convinced Nambu Networks (tr.im's developer, whose main products are Twitter apps) to abort the planned shutdown. I, and a lot of other Internet users, thought all was well.1 Crisis seemed to have been averted. Now this.
Mashable, in the article from last August, stated optimistically that someone would probably buy the service before the planned hard shutdown sometime after December 31, 2009. Obviously that hasn't happened, or the service wouldn't be shutting down. But there has to be a better solution than pulling the plug, even if that doesn't happen until 2011 or 2012.
I can accept that Nambu administrators have had to deal with a lot of spam links being generated using their service, but it puzzles me that the spam would lead hosting providers to threaten termination of the site. After all, Nambu is not responsible for the links its users submit, nor the contents on the other end of its redirections — but that's far beyond my expertise.
However I must wonder: Instead of just giving up, why not develop better spam-fighting algorithms? Digg, Reddit, any site that accepts user-submitted links — even Facebook and Twitter — have countless spammers fighting to get their links in front of millions of users, and they all do a pretty good job of keeping it off the site algorithmically, with no human intervention. I don't see bit.ly giving up its fight against spam, or is.gd, TinyURL, SnipURL, or any of the other established shortening services. They must have spam link submissions too, but they get by. None of the other shortening services I've come across in the past few years have ever threatened to disappear, for spam volume or any other reason — and I've looked at a lot of them. Yet tr.im has done so now twice in less than nine months, and it looks like this time may be for good.
A lot can happen between now and when Nambu decides to finally pull the plug on tr.im's redirection service, of course. Perhaps a buyer will surface. (Then again, offers were made in August, only to be turned down because Nambu didn't feel it could trust the potential buyers.) Perhaps Nambu will change its mind — again. Heck, I'd buy the service and run it myself if I had the funds. Anything's better than breaking millions of links across the Internet; shutting down a service like tr.im will even affect email archives, since shortened URLs make their way into emails all the time.
No matter what happens, I'm going to follow the old saying, "Fool me once, shame on you. Fool me twice, shame on me." I stayed loyal to tr.im and Nambu after they threatened to make my digital world fall apart last summer. I continued to use their awesome service because I loved it — the name, the interface, everything — and they've turned around and made the same threat, only stronger. I cannot possibly ignore this decision, what amounts to pulling out the knife they stabbed in all of their users' backs in August and driving it back in an inch away. It's absolutely infuriating. SnipURL (and snurl.com, sn.im, cl.lk, and snipr.com — the service maintains five different options), here I come.
tr.im, you've been a great example. Nambu, I sure as hell won't be buying any of your software products, ever. You better give some serious thought to giving us users a way to keep the redirections working, or at least a way to export the redirections we've created so we can go through and change or annotate whatever old content we can to keep the links from breaking, because that's the big reason I'm angry. If you simply shut down, you will be intentionally breaking a large percentage of the Web.
Is this the future of millions of tr.im URLs all over the Internet?
- It is, however, true that many users vowed to never again use tr.im after that episode. I wasn't one of them, but as it turned out that was a mistake. [↩]
reMAP: IMAP reConceptualized
Gabor Cselle, the founder of reMail, recently posted an idea for replacing the IMAP email protocol with something easier to work with. The proposed name? reMAP, short for reimagined Mail Access Protocol.
He calls for a RESTful design that, among other things, would globalize message identifiers (rather than changing them the instant a message is moved to a new folder), replace folders with labels (à la Gmail), require the server to handle email search indexes, and make conversations the basic unit of email (instead of individual messages). reMAP would also make handling MIME messages unnecessary; the client could simply call the server with a request for text or HTML message representations without having to deal with parsing the MIME format itself.
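To illustrate just the identifier-and-label part of that list, here is a small Python sketch. None of it comes from Gabor's post: the class names, method names, and sample data are hypothetical, and it models the idea in memory rather than as a network protocol. The point is only that "moving" a message becomes a label change while its identifier stays put.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Message:
    message_id: str                 # global identifier; never changes
    subject: str
    labels: Set[str] = field(default_factory=set)


class Mailbox:
    """Toy in-memory store keyed by the global message identifier."""

    def __init__(self) -> None:
        self._messages: Dict[str, Message] = {}

    def add(self, message: Message) -> None:
        self._messages[message.message_id] = message

    def get(self, message_id: str) -> Message:
        return self._messages[message_id]

    def relabel(self, message_id: str, remove: str, add: str) -> None:
        # The label-based equivalent of moving a message between folders:
        # only the label set changes, so cached identifiers remain valid.
        labels = self._messages[message_id].labels
        labels.discard(remove)
        labels.add(add)

    def with_label(self, label: str) -> List[str]:
        # Return the identifiers of all messages carrying the given label.
        return sorted(
            m.message_id for m in self._messages.values() if label in m.labels
        )
```

After `relabel("m1", remove="inbox", add="archive")`, a client that remembered "m1" can still fetch the message with `get("m1")`. With folder-specific identifiers, a client has to rediscover the message after a move.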
I personally am in agreement with his entire proposal. The experiences I've had with IMAP in the past have highlighted shortcomings in a standard that was drafted over 15 years ago. Email has changed a great deal since then, but IMAP has not been revised to accommodate the enhancements made by newer clients and services like Gmail.
If IMAP is to be improved, it's probably appropriate to just completely replace it with something new. If the new system can translate IMAP commands into the equivalent operations in its own protocol, that's even better, because then servers can be upgraded without worries of breaking compatibility with older clients or the need to run server applications for IMAP and reMAP side by side.
There's plenty of discussion going on at the original post and on Hacker News. If, however, you would like to say something here, please don't hesitate.
As a side note, I see that Gabor is using Blogger's FTP publishing option, which will be going away soon. I hope the link will still work when he has to move.
“Houdini” plugin for WordPress is no magician
I've seen some pretty absurd WordPress plugins show up in the Plugins dashboard widget on this site, but the recently-released "Houdini" takes the cake so far. It claims to prevent spammers from copying the contents of any post or page upon which the [houdini] shortcode is placed.
The fact is the internet is open can lead to theft especially to content stealing and plagiarism.
Until now, there was very little to discourage and deter this serious crime. Yes content theft and plagarism is a crime in some jurisdictions.
You cannot rely on others or the authorities to continue to police the internet as they do not have enough resources. You need to protect your content and deter this theft.
The basic form of content theft is to copy and paste your content to another medium.
Well Houdini, prevents this using a little known special algorithm that prevents copying by making the selected text that is targeted by the perps to be copied, to disappear! Yes disappear!!! The only way to recover is to reload the page in the web browser. If they try again, the content disappears again. As long as they keep trying to select and copy your content, the content will disappear before they can get a chance to execute the copy command!
After a few unsuccessful attempts, the theives will move on to a easier target.
Your safe!
So what can we glean from this PHK Corporation plugin's description, other than the fact that the author has poor English skills? We can most definitely conclude that phkcorp2005 has no understanding of how most copying of Internet content is carried out. As I and others have pointed out many times over in blog and forum posts, copying is usually not done by a person using a mouse to cut and paste, but rather by automated computer programs called scrapers. (For the uninitiated: See these two Wikipedia articles.)
What is left out of that messy, error-riddled description is the word "JavaScript". It is by no means the only word or phrase that should be inserted, but it is the most important. That fifth "paragraph" (the formatting is also very poor) should say "special JavaScript algorithm", which is synonymous in this case with "useless JavaScript algorithm". All it does is wait for the user to try to select text in the browser and clear the selection if any is made. Besides, any copy-protection scheme based upon JavaScript is inherently useless by virtue of the fact that it doesn't do anything to prevent copying. There are tons of ways to get around it: disabling JavaScript, for example (as mentioned below).
For example, take hatkirby's rant. I quote from that post the list of circumvention techniques below:
- Go old fashioned and turn off JavaScript. Yep, the script is rendered useless.
- More advanced content thieves likely don't just go around to random blogs and copy/paste off of them. They write screen scrapers, small programs that visit sites and download specific parts of the site. As these do not render pages and simply download from them, the script isn't even seen by the scraper.
- Due to the nature of the Internet, anyone, and I mean anyone, can see the source code of a website. It's done differently in different web browsers, but it's always pathetically easy and, as it simply shows HTML code instead of parsing anything, no scripts are run.
- RSS. Syndication feeds are normally viewed in feed readers with little to no JavaScript interpreter. Script bypassed.
- There's this cool little button on most keyboards that says "Print Screen". Even on the keyboards that don't have it, there's usually a key combination that achieves the same effect. It takes a picture of whatever's on the screen. No selection occurs and yet the thief has a copy of your article. They do, however, have to retype it, so this keeps the lazy thieves out.
That's just a smattering of ways to get around the JavaScript inserted by Houdini.
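The scraper point is easy to demonstrate. Below is a minimal Python sketch using only the standard library's html.parser module. The sample page and its inline script are hypothetical stand-ins that I made up (this is not Houdini's actual code), and a real scraper would fetch pages over HTTP instead of reading a string. What matters is that the script is never executed; it is just more text to skip.

```python
from html.parser import HTMLParser


class PostTextExtractor(HTMLParser):
    """Collects page text while ignoring <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0   # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def extract_text(html: str) -> str:
    parser = PostTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical "protected" page: the script is a made-up stand-in.
SAMPLE_PAGE = """<html><body>
<h5>This page is copy protected</h5>
<script>document.onselectstart = function () { return false; };</script>
<p>Original post text that the plugin is supposed to protect.</p>
</body></html>"""

print(extract_text(SAMPLE_PAGE))
```

The extracted text contains the post and none of the script, and no selection ever took place for the script to clear.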
In the face of all the arguments presented, the plugin's author has insisted that the purpose of Houdini is not to "prevent" copying, but to "deter" copying. I don't think that statement holds any weight whatsoever. It still depends upon the copying being performed in a JavaScript-enabled browser by a human.
There's also the matter of just how absurd copy-protection of any kind is on the Internet. Every single document or file anywhere on the Internet must be copied in order for the user-agent (usually a browser in the case of human interaction) to retrieve and display or otherwise make use of the content. This is why it's quite simple for any user to just view the source code of a page. It has to be copied in order to display the content.
Also mentioned in the first (started, chronologically) forum thread is the ability of JavaScript to disable the browser's context menu and thus the "View source" option. That's just as useless as the selection-clearing code, and actually more so because many modern browsers allow specific JavaScript capabilities to be disabled — capabilities like removing or replacing the context menu — as an alternative to disabling all JavaScript. The "View source" option is also present in other places — places such as the browser toolbar's "View" or "Tools" menu — which JavaScript code cannot modify even in the most permissive environment.
Legitimate quoting must also be considered. There are a million and one reasons why someone might legitimately want to copy a few sentences of a blog post. Maybe they like it enough to post a quote to Twitter or Facebook, or perhaps they want to comment on it in a blog post of their own. Content theft is a big problem, but the old methods of periodically searching for and reporting content stolen from one's site are infinitely preferable to this plugin's ineffective method.
Finally, why require the use of a shortcode? Why not just add the script globally to all content pages and forget that stupid "This page is copy protected" header?
At most, Houdini has the ability to add a superfluous <h5> tag to the page and annoy legitimate users with an obnoxious script while doing absolutely nothing to thwart real content thieves. I wonder if WordPress Extend would consider removing this laughable plugin from the directory... Of course, we bloggers would then be denied this ripe opportunity to satirize this particular piece of code.









