It’s easy to forget that all internet data has to be physically stored somewhere. Despite its name, “the Cloud” exists on servers, not in the sky. When you consider the massive amount of data that comprises the internet, you can begin to understand just how much time and money are invested in storage technology. Usenet is no exception. The messages, articles, and binaries that you see on Usenet have to be stored on servers. Retention is how we quantify that data storage.
What Is Retention?
Usenet is a massive community in which users constantly exchange large amounts of information. As a result, it is almost impossible to accurately quantify Usenet data in terms of bytes. Instead of bytes, we measure stored data in units of time, and we call that measure retention. Retention is the length of time a post will be available on Usenet. It is typically measured in days.
History of Retention
Retention was Usenet’s answer to the issue of data storage. Retention of your data costs money. Space, energy, cooling, and maintenance all make your data accessible. So, to keep Usenet affordable and accessible, Usenet providers decided to limit the amount of time they would host data. This was the most logical way to ration space on servers. An alternative would have been to keep the most read or downloaded posts. This would get complicated, though. In that scenario, the most recent posts would always have the lowest views and downloads. Thus, getting rid of the oldest posts is the fairest and easiest way to go.
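The reasoning above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the `Post` record, the two eviction functions, and the sample numbers are all made up for the example and do not describe any provider's real software or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    name: str
    age_days: int    # how long ago the post was made
    downloads: int   # total downloads so far


def evict_oldest(posts, n):
    """Time-based retention: pick the n oldest posts for deletion."""
    return sorted(posts, key=lambda p: p.age_days, reverse=True)[:n]


def evict_least_downloaded(posts, n):
    """Popularity-based alternative: pick the n least-downloaded posts."""
    return sorted(posts, key=lambda p: p.downloads)[:n]


# Hypothetical sample data, invented for illustration only.
posts = [
    Post("old-popular", age_days=900, downloads=5000),
    Post("old-obscure", age_days=850, downloads=40),
    Post("recent", age_days=30, downloads=300),
    Post("brand-new", age_days=1, downloads=2),
]

print([p.name for p in evict_oldest(posts, 1)])
print([p.name for p in evict_least_downloaded(posts, 1)])
```

With this sample data, the popularity-based rule selects the post that is only one day old, simply because it has not had time to be downloaded yet. The time-based rule never has that problem, which is why it is the simpler and fairer choice.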
The issue of data storage is largely invisible to the average internet and Usenet user. Some of this is by design. Servers are concentrated in clusters that are often located in remote places where energy and space are cheap. Servers can range in size from Network-Attached Storage devices to massive data centers that have their own independent power and cooling systems.
Redundant storage and power guarantee that the hosted data is available at a moment’s notice. Ensuring fast access to posts requested by users is critical. To that end, servers prioritize newer data and more commonly requested data over older data and less commonly requested data.
The problem of data retention is not unique to Usenet, but it is treated a little differently. Google offers 15 GB of space on Google Drive, but extra space costs more. With Usenet, there are no limitations on the size of posts, but every post will eventually be erased. At least theoretically, that is.
Binary vs. Text Retention
There is a great discrepancy in the size of binary and text posts. Binary posts, of course, are much larger than text posts. So, storing text posts requires far less space than storing binary posts.
Many Usenet providers deceptively advertise their retention rate in terms of their text retention. This is because it is usually higher than their binary retention. Oftentimes, these providers outsource their storage, and it is far more cost-efficient to store text posts than it is to store binary posts. At Newshosting, though, our retention is the same for both binary and text posts. This is because we own and operate our own server farms.
Sending data to intermediate storage before it is used is known as spooling. The word “spool” is usually explained as an acronym for simultaneous peripheral operations online. Spooling dates back to the early days of computing, and it is quite significant for retention.
Traditional retention requires all posts to be deleted after a certain amount of time has passed. Spooling, however, allows data to be stored indefinitely. It is useful to think of spooling like buffering or queueing. Posts are stored on intermediate servers. When someone accesses a post stored on one of those servers, it is retrieved for use.
On November 1, 2016, Newshosting became the first Usenet provider to reach 3,000 days of retention. We haven’t looked back since. With each passing day, we add another day to our retention, and we have no intention of stopping any time soon.
Newshosting began spooling in 2008. Before 2008, all providers would limit their retention and delete all older posts. If a provider claimed 1,000 days of retention, for example, a post would be deleted once it reached 1,000 days old. At Newshosting, though, every post made since we began spooling in 2008 remains available.
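To make the difference concrete, here is a minimal Python sketch of a fixed retention window next to a growing one. The function names and the dates in the usage lines are our own illustrative assumptions, not a description of Newshosting's actual systems. The cutoff date is a placeholder, since the exact day spooling began is not given here.

```python
from datetime import date


def available_fixed(posted: date, today: date, retention_days: int) -> bool:
    """Fixed window: a post is deleted once its age reaches retention_days."""
    age_days = (today - posted).days
    return 0 <= age_days < retention_days


def available_growing(posted: date, today: date, keep_since: date) -> bool:
    """Growing window: every post made on or after keep_since is kept."""
    return keep_since <= posted <= today


def growing_retention_days(today: date, keep_since: date) -> int:
    """Retention under the growing model; it increases by one each day."""
    return (today - keep_since).days


if __name__ == "__main__":
    # All dates below are hypothetical placeholders for illustration.
    keep_since = date(2019, 6, 1)
    posted = date(2020, 1, 1)
    today = date(2024, 1, 1)

    print(available_fixed(posted, today, retention_days=1000))
    print(available_growing(posted, today, keep_since))
    print(growing_retention_days(today, keep_since))
```

Under the fixed model, availability depends only on the age of the post. Under the growing model, it depends only on whether the post was made after the cutoff, so the advertised retention figure goes up by one with each passing day.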
All in all, retention is another aspect of Usenet that makes it distinct. We tend to think of storage in terms of the size of files, so defining storage by a length of time can seem strange to newcomers. With such an active, widespread community, however, Usenet faced unique challenges when it came to data storage. The solution was retention. Newshosting’s use of spooling altered the landscape of Usenet retention for good. Now, retention is constantly expanding, taking Usenet to new heights and preserving all of the great articles from long ago.