Six years after announcing plans to archive Twitter, the Library of Congress continues to struggle in this unlikely partnership. As of today, there’s still no tweet archive and no set launch date. Instead, unprocessed tweets are saved to a server, collected but uncatalogued, and unusable to researchers looking to mine them for information.
One unexpected obstacle is the rapid growth in the volume of Twitter data. In 2010, Twitter users posted an estimated 55 million tweets a day. That number jumped to 140 million in 2011 and skyrocketed to 500 million in 2012, only a year later. Tweets also became larger, loaded with photos and video, increasing the size of the Library of Congress’s daily downloads.
Feeds of the raw data are available through Twitter, but only for a price. Without the funds to pay, researchers are struggling to access the data they need and are relying on the distant hope that the Library of Congress will get its tweet archive off the ground. The Library of Congress maintains that the tweet archive is a priority and says it is trying to adapt to the challenges of cataloging this vast body of data, but only time will tell whether it succeeds.
— Crystal Chen, LIS-653-02