Building on my earlier post, “How to mine for content in an existing business,” I want to highlight some of the trials and obstacles that I had to overcome.
First and foremost is the evolution of data. I am talking about document files and those pesky old word processors that had their heyday in the Eighties and early Nineties: PFS Write, WordPerfect, Ami, and others. Some of these packages did a good job, and I actually liked PFS Write. WordPerfect was a beast, and I spent many hours in college working with the insert codes to get pages to print as required by each professor.
It is probably this background in the older versions of software that has allowed me to scavenge such large amounts of untapped content.
In my quest to repurpose data I have had to run the gauntlet of 5.25-inch floppies, 3.5-inch hard-shell floppies, Zip disks from the '90s, and mini-cartridge tapes.
The first thing I realized was that I had compatibility issues between old and new versions of software. To get around this I sorted all my data by date and then by software program. I then looked in the back closets, fire safes, and deposit boxes for old versions of the install software. It is amazing what you can find if you are willing to dig in the back of cabinets and get your hands dirty. If I could not determine the age of a disk or storage device from its label, I simply set it aside until I had pieced together a machine that could read it.
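For anyone who has already copied the old disks onto a modern drive, the sort by date and then by program can be roughed out with a short script. This is only a sketch under assumptions: the extension-to-program table and the function names are hypothetical and were not part of my original process, many old files carry no extension at all, and the modified date on a copied file is only as trustworthy as the copy that made it.

```python
from datetime import datetime
from pathlib import Path

# Hypothetical extension-to-program table; adjust it to match your own archive.
PROGRAM_BY_EXTENSION = {
    ".cdr": "CorelDraw",
    ".pm5": "PageMaker",
    ".wpd": "WordPerfect",
    ".sam": "Ami",
}


def guess_program(path):
    """Guess the originating program from the file extension alone."""
    return PROGRAM_BY_EXTENSION.get(Path(path).suffix.lower(), "unknown")


def build_inventory(root):
    """Return (year, program, path) rows sorted by year, then by program."""
    rows = []
    for path in Path(root).rglob("*"):
        if path.is_file():
            # The modified time stands in for the creation date here.
            modified = datetime.fromtimestamp(path.stat().st_mtime)
            rows.append((modified.year, guess_program(path), str(path)))
    return sorted(rows)


if __name__ == "__main__":
    # "archive" is a placeholder folder name for the copied disk contents.
    for year, program, path in build_inventory("archive"):
        print(year, program, path)
```

Anything that comes back as "unknown" goes in the same pile as the unlabeled disks: set it aside until there is a machine and a program that can open it.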
Next I located an old desktop machine that everyone considered too out of date to be useful. I did set limits on this and pieced together one that would run Windows XP, because I wanted to be able to run DOS-based programs as well as Windows 3.1 software. Being able to use a thumb drive was also extremely important for getting the old data to my everyday machine. I then scavenged tape, Zip, and floppy drives from old cases and now had a true masterpiece of a machine. The drivers proved to be a bit problematic, but the web is the best junk closet of all time. The 5.25-inch floppy drive was covered in dust and I had real doubts, but it worked.
Next I chose a target date for the software that would match the files' creation dates. My choice was 1995. This worked because DOS-based programs were still in use and Windows 3.1 was just starting to give way to Windows 95. XP allowed me to work efficiently up into the early 2000s, as long as I did not tax the machine's resources too much. It also let me work back into the late '80s. I then installed old versions of CorelDraw and PageMaker that were released on or near this date.
Because the images were in CorelDraw files and the text was in PageMaker, I had to strip out graphics and text separately. Corel proved to be the most difficult.
At the time of creation all literature was saved in the format in which it was used. PageMaker was set up as the final presentation package, and CorelDraw had all its image files saved as CDR with an EPS link to the PageMaker document. Unfortunately, once I was able to get the PageMaker document open, I could not export or convert the EPS file with enough image quality to post to the web. This required me to open every image in CorelDraw 5.0 and then export it. It was another step added to the process, but I now had raw images in a form I could use.
Once PageMaker was open I fell back on good old Notepad. I spent a lot of time cutting and pasting text from PageMaker into Notepad, but in the long run it was much faster than trying to retype the copy or scan and re-edit it. Notepad also stripped all the text back to raw text format in one easy step. I decided very quickly it was better to start with clean text and reformat than to try to preserve the existing formatting.
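If the pasted text ever needs a second cleaning pass, something like the following could do it. To be clear, this is not what Notepad does internally and it is not the method I used; it is a minimal sketch of the same "start with clean text" idea, and the replacement table and function name are my own assumptions.

```python
import re
import unicodedata

# Typographic characters mapped to plain equivalents (an assumed, partial list).
PLAIN_EQUIVALENTS = {
    "\u2018": "'", "\u2019": "'",    # curly single quotes
    "\u201c": '"', "\u201d": '"',    # curly double quotes
    "\u2013": "-", "\u2014": "-",    # en and em dashes
    "\u2026": "...",                 # ellipsis
    "\u00a0": " ",                   # non-breaking space
}


def to_plain_text(text):
    """Reduce pasted copy to plain text with simple, predictable rules."""
    for fancy, plain in PLAIN_EQUIVALENTS.items():
        text = text.replace(fancy, plain)
    # Normalize DOS and old Mac line endings to a single newline.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop control characters (form feeds and the like), keeping newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) != "Cc"
    )
    # Collapse runs of spaces and tabs, and trim each line.
    lines = [re.sub(r"[ \t]+", " ", line).strip() for line in text.split("\n")]
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_plain_text("\u201cOld\u201d   copy\r\nwith  odd\x0c spacing\u2026"))
```

The rules are deliberately blunt. The point is to end up with text that carries no formatting at all, so the reformatting can start from a known state.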
Now that I had the text and images, I made certain to keep a copy of the text in raw text form. This will hopefully allow anyone 15 years down the road to access the files again. As for the GIFs and JPEGs, time will tell. I tend to save in PNG format now, but I will have to take the chance that these formats stay around awhile.
So why go to all the trouble to get this old data from the vault?
Content is the driving force for any website, and having hundreds of old articles containing industry keywords was a gold mine. The company is now able to supplement its blog for nearly two years with this content. It represents thousands of dollars that have already been spent, and with minor edits it will be useful for years to come.
It also serves as a record of how the company's industry has evolved. New engineers entering the market will have resource documents for already installed industrial products and will be able to understand the transition from old to new.
John Wilkerson is a Marketing/Sales Professional specializing in online branding, ecommerce sites, blogging, email advertising, content creation, print media, and direct mail. Follow @johnwilkerson