Thursday, May 19, 2011

Data science toolkit - address, coordinates, text, file, IP address, people names

Welcome to the Data Science Toolkit

Steal this server!

Grab this entire site as a free, self-contained, ready-to-run VM

Independence - Never worry about the provider going offline, or charging once you're hooked.

Security - Run on your intranet, so customer information stays within the firewall.

Scalability - No API limits. Run a cluster of as many instances as you need.

Street Address to Coordinates

API: /street2coordinates
Street Address to Location calculates the latitude/longitude coordinates for a postal address.
Currently restricted to the US and UK.
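As a rough illustration of calling this endpoint from code, the sketch below only builds a request URL; it does not contact any server. The base URL and the `/street2coordinates/<address>` path layout are assumptions (the page above gives only the endpoint name), and `street2coordinates_url` is a hypothetical helper name.

```python
from urllib.parse import quote

# Assumed base URL; replace it with the address of your own VM if you run one.
BASE_URL = "http://www.datasciencetoolkit.org"


def street2coordinates_url(address, base_url=BASE_URL):
    """Build a GET URL for a single postal address.

    Assumes the address is passed as a URL-encoded path segment after
    /street2coordinates/; check this against the API you are running.
    """
    # safe="" makes sure commas and slashes in the address are escaped too.
    encoded = quote(address, safe="")
    return f"{base_url.rstrip('/')}/street2coordinates/{encoded}"


if __name__ == "__main__":
    print(street2coordinates_url("10 Downing St, London"))
```

Keeping the base URL as a parameter makes it easy to switch between the public site and a local instance.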

File to Text

API: /file2text
Converts PDFs, Word Documents, Excel Spreadsheets to text.
Recovers text from JPEG, PNG or TIFF images of scanned documents.

Coordinates to Political Areas

API: /coordinates2politics
Returns the country, region, state, county, constituencies and neighborhood a point is inside.

Geodict

API: /v1/document
Geodict pulls country, city and region names from unstructured English text, and returns their coordinates.
It emulates the interface to Yahoo's Placemaker, so switching should just mean changing 'http://wherein.yahooapis.com/' to 'http://www.datasciencetoolkit.org/' in your current code.
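The switch described here can be sketched as a small URL rewrite. The two base URLs and the /v1/document path come from the text above; the function name `to_geodict_url` is hypothetical, and the sketch assumes your existing code holds the Placemaker URL as a plain string.

```python
PLACEMAKER_BASE = "http://wherein.yahooapis.com/"
DSTK_BASE = "http://www.datasciencetoolkit.org/"


def to_geodict_url(url):
    """Point a Placemaker URL at the Data Science Toolkit instead.

    URLs that do not start with the Placemaker base are returned unchanged.
    """
    if url.startswith(PLACEMAKER_BASE):
        # Keep the path (for example v1/document) and swap only the host part.
        return DSTK_BASE + url[len(PLACEMAKER_BASE):]
    return url


if __name__ == "__main__":
    print(to_geodict_url("http://wherein.yahooapis.com/v1/document"))
```

In most codebases a single constant holds the base URL, so editing that constant is likely all that is needed; the helper is only useful when URLs are scattered through the code.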

IP Address to Coordinates

API: /ip2coordinates
IP Address to Location calculates country, state, city and latitude/longitude coordinates for IP addresses.
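A minimal sketch of preparing a lookup for this endpoint follows. It validates the address locally with the standard `ipaddress` module and builds a URL without sending anything. The base URL and the `/ip2coordinates/<ip>` path layout are assumptions, and `ip2coordinates_url` is a hypothetical helper name.

```python
import ipaddress

# Assumed base URL; replace it with the address of your own VM if you run one.
BASE_URL = "http://www.datasciencetoolkit.org"


def ip2coordinates_url(ip, base_url=BASE_URL):
    """Build a GET URL for one IP address.

    Assumes the address is appended as a path segment after
    /ip2coordinates/. Raises ValueError if `ip` is not a valid address.
    """
    # ip_address() raises ValueError for malformed input, so bad values
    # are rejected before any request would be made.
    parsed = ipaddress.ip_address(ip)
    return f"{base_url.rstrip('/')}/ip2coordinates/{parsed}"


if __name__ == "__main__":
    print(ip2coordinates_url("8.8.8.8"))
```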

Text to Sentences

API: /text2sentences
Removes the parts of the text that seem to be boilerplate, leaving the real sentences.
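For the text-based endpoints, a request might be prepared as in the sketch below. It only constructs a `urllib.request.Request` object and never sends it. That the endpoint accepts the raw text as a UTF-8 POST body is an assumption, as are the base URL and the hypothetical helper name `text2sentences_request`.

```python
from urllib.request import Request

# Assumed base URL; replace it with the address of your own VM if you run one.
BASE_URL = "http://www.datasciencetoolkit.org"


def text2sentences_request(text, base_url=BASE_URL):
    """Prepare (but do not send) a POST request for /text2sentences.

    Assumes the endpoint reads the raw text from the request body.
    """
    return Request(
        url=f"{base_url.rstrip('/')}/text2sentences",
        data=text.encode("utf-8"),  # supplying a body makes this a POST
        method="POST",
    )


if __name__ == "__main__":
    req = text2sentences_request("Home | About | Contact. This is a real sentence.")
    print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.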

HTML to Text

API: /html2text
Returns the full text that would actually be displayed in the browser when the HTML document is rendered.

HTML to Story

API: /html2story
Takes an HTML document representing a news article or similar page, and extracts just the story text.

Text to People

API: /text2people
Spots text fragments that look like people's names or titles, and guesses their gender where possible.

Text to Times

API: /text2times
Spots text fragments that look like times or dates, and converts them into a standard form.
