Help yourself to a little bit of freeware by downloading and unzipping the PageTranslator zip file 🙂
What does this software do and why would I want it?
We needed a way of extracting page information from a website (including meta tags and the main information on a page) and then translating that information into many languages.
We encountered a number of issues, the first being: how do we extract just the information pertaining to the page and not all the header information? The solution was to leverage the existing NuGet package BoilerPipe for .NET. This package parses the web page and removes the clutter surrounding the main body of text, such as navigation bars and advertisements.
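To make the idea concrete, here is a rough sketch in Python of pulling the title, the meta tags, and the main text out of a page while skipping obvious clutter. It is not how BoilerPipe works internally and not the code used in PageTranslator; the names PageInfoParser, extract_page_info and SKIP_TAGS are made up for this illustration, and the list of "clutter" tags is just an assumption.

```python
from html.parser import HTMLParser

# Tags whose contents are treated as clutter in this sketch. This list is an
# assumption for illustration, not BoilerPipe's actual heuristics.
SKIP_TAGS = {"script", "style", "nav", "header", "footer", "aside"}


class PageInfoParser(HTMLParser):
    """Collects the title, named meta tags, and text outside clutter tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.text_parts = []
        self._skip_depth = 0   # > 0 while inside any clutter tag
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Covers both <meta name="..."> and <meta property="og:...">.
            name = attr.get("name") or attr.get("property")
            if name and attr.get("content") is not None:
                self.meta[name.lower()] = attr["content"]

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif self._skip_depth == 0:
            self.text_parts.append(text)


def extract_page_info(html):
    """Return the title, meta tags, and de-cluttered text of an HTML string."""
    parser = PageInfoParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "meta": parser.meta,
        "text": " ".join(parser.text_parts),
    }
```

Calling extract_page_info on a downloaded page returns a dictionary with the keys "title", "meta" and "text". A real boilerplate remover is considerably smarter than a fixed tag list, which is why we used an existing package rather than rolling our own.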
Next was the challenge of translating the page information. Enter Amazon Translate, which incidentally allows translation of two million characters per month for free with the AWS Free Tier. If you wish to use the translation feature of this software, you will need to create an AWS account and then create an AWSAccessKey and an AWSSecretKey.
Enter these two keys as the login and password in the Page Translator software and press the ‘Save’ button. (Note that this is optional: if you do not need the translation services, there is no need for any AWS keys.)
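For the curious, a call to Amazon Translate can be sketched in Python roughly as follows. This is only an illustration under a few assumptions, not the code inside PageTranslator: the helper names split_into_chunks and translate_page_text are invented here, the 10,000 byte per-request limit is assumed and should be checked against the current AWS documentation, and the client argument is expected to be a boto3 Translate client.

```python
# Assumed per-request size limit for Amazon Translate; check the AWS docs.
MAX_BYTES = 10_000


def split_into_chunks(text, max_bytes=MAX_BYTES):
    """Split text on whitespace into chunks whose UTF-8 size fits max_bytes."""
    chunks, current = [], ""
    for word in text.split():
        candidate = word if not current else current + " " + word
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single word longer than max_bytes is kept whole in this sketch.
            current = word
    if current:
        chunks.append(current)
    return chunks


def translate_page_text(client, text, target_language, source_language="auto"):
    """Translate text chunk by chunk and join the results with spaces.

    `client` is expected to behave like boto3.client("translate").
    """
    translated = []
    for chunk in split_into_chunks(text):
        response = client.translate_text(
            Text=chunk,
            SourceLanguageCode=source_language,  # "auto" requests language detection
            TargetLanguageCode=target_language,
        )
        translated.append(response["TranslatedText"])
    return " ".join(translated)
```

With boto3 installed, the client could be created with boto3.client("translate", region_name=..., aws_access_key_id=..., aws_secret_access_key=...), passing in the two keys described above, and then used as translate_page_text(client, page_text, "de"). Splitting on whitespace loses the original line breaks, which is acceptable for a sketch but worth handling properly in real use.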
Running the software
First, unzip the attached file and then double-click the file called PageTranslator.
From there, enter the URL of the page you would like to download information from in the URL box at the top, and then press the Download button.
If the page could be parsed, the information pertaining to the page will be displayed as shown below.
If you would like to translate the page text, ensure that you have entered the AWS information as previously specified, then check the Translate Text checkbox, select the translation language as shown below, and press the Download button.
That’s all there is to it. If you like this, feel free to purchase one of our retro gamer tee-shirts.