| DesignersTalk > Extract Data from site and put into tables? |
#1 (permalink) |
|
Registered User
Join Date: Apr 2006
Posts: 3
|
Brief Overview: I am trying to figure out a way to extract data from a site and organize the data into a table. Details: When I run a search through my local MLS system, I get an output of properties that are available on the market. The output data is in list form with multiple columns (Address, Style, Price, etc.). I would like to be able to extract the info from specific columns and put it into a table on my website. I need to make this as automated as possible because it will be done on a weekly basis. Is this at all possible? I have been searching around for an easy way to do this but have not had any luck. Some insight on how to handle this would be greatly appreciated! -iSellJerseyShore |
|
|
|
|
|
#2 (permalink) |
|
Everything is fine.
|
Yes, I would imagine this is possible by all means. However, without seeing some representative data and its structure, it would be impossible for us to offer any advice towards a solution. Perhaps if you posted some dummy output, someone on here might be able to help you out. You would also need to advise on: how you perform your queries; how your MLS system returns the data (output to screen or to file); whether it is a web-based service or a desktop program; and so on. There are many vital pieces of information needed in order to give you a specific answer to your question. Alternatively, if this is too much for you to handle on your own, you could hire a programmer to work with you and create an automated solution (or as close as possible). - Mike |
|
|
|
#3 (permalink) |
|
Registered User
Join Date: Apr 2006
Posts: 3
|
Mike, thanks for the quick response! I have attached a copy of the dummy query to this post. The MLS is a web-based service; I run the query from my browser, and the output is displayed within the browser. I can highlight and copy all of the output like any other webpage. Thanks again! -iSellJerseyShore Last edited by iSelJerseyShore : 19-04-2006 at 16:21. |
|
|
|
#4 (permalink) |
|
Registered User
Join Date: Apr 2006
Posts: 3
|
What I am trying to extract is the Property Address, Style, Sale Price, Sale Date & DOM, and publish that data in a table on my site. *note* I replaced the originally posted Query.txt file with the Easy_Query.txt file attached to this post. The other file was not clean output like this file. Thanks! -iSellJerseyShore |
|
|
|
#5 (permalink) |
|
Everything is fine.
|
There is some bulky formatting code in there, which is a shame; it makes the job of data extraction a lot harder. After looking through it, you probably could extract the data you want from the markup MLS generates. You would need a server-side scripting language, such as Perl or PHP, and some regular expressions to tell it what data to pull out. This method is known as "screen scraping".

The downside, however, is that every time the company (or whoever controls the query output) changes the generated HTML, your script will need to be rewritten in places to cater for the change(s). I am also guessing the results are generated after a standard form submission, so your script would need to mimic that submission (including any user authentication requests) before retrieving and parsing the data.

Some companies offer what's known as API access, where you can retrieve data from a database in a specially formatted way, such as an XML file or a tab-delimited file. This makes data extraction a lot leaner and quicker, without the overheads and worries of screen scraping. I'm guessing the MLS system doesn't offer this type of access?

You would also need to decide the frequency of your extractions. Do you perform the extraction whenever the data is requested, or do you run it at a set time every day and update a local file? The second method is often preferred for data that changes every few hours; in essence you are storing a local 'cache' of the data output, with far less overhead than performing the extraction "on demand".

Overall, if you are looking for a general path to take and a place to start, I would suggest PHP as your language of choice. PHP supports everything needed to perform your request. Hope this helps get you going in the right direction. Perhaps some of the DT'ers will chime in and offer some alternative and/or better advice.
- Mike

PS: There could be further issues with the system that prevent this procedure from working; however, if it is as simple as you have posted, then all should be fine. |
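
As a rough illustration of the screen-scraping step described above, here is a minimal sketch. It is written in Python rather than the PHP or Perl suggested in the post, and it uses the standard library's `html.parser` instead of regular expressions, since a tag-aware parser tends to cope better with bulky formatting markup. The attached Easy_Query.txt is not reproduced in the thread, so the `SAMPLE` markup, the header names, and the helper names (`TableRows`, `extract_columns`, `render_table`) are all assumptions made for illustration; the real MLS output will almost certainly differ.

```python
from html import escape
from html.parser import HTMLParser

# Columns the original poster wants to publish. The names come from the
# thread; the header text in the real MLS output is an assumption.
WANTED = ["Address", "Style", "Sale Price", "Sale Date", "DOM"]


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []
        # Formatting tags such as <font> are ignored; only their text is kept.

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None


def extract_columns(html, wanted=WANTED):
    """Return the wanted columns of each data row, located by header name."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    # Raises ValueError if a wanted column is missing, e.g. after a layout change.
    index = [header.index(name) for name in wanted]
    return [[row[i] for i in index] for row in body if len(row) == len(header)]


def render_table(rows, headers=WANTED):
    """Render the extracted rows as a plain HTML table."""
    head = "".join(f"<th>{escape(h)}</th>" for h in headers)
    lines = ["<table>", f"  <tr>{head}</tr>"]
    for row in rows:
        cells = "".join(f"<td>{escape(c)}</td>" for c in row)
        lines.append(f"  <tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical stand-in for the MLS query output (made-up listings).
SAMPLE = """
<table>
  <tr><th>MLS#</th><th>Address</th><th>Style</th><th>List Price</th>
      <th>Sale Price</th><th>Sale Date</th><th>DOM</th></tr>
  <tr><td>1001</td><td><font size="1">12 Example Ave</font></td><td>Colonial</td>
      <td>$310,000</td><td>$300,000</td><td>04/01/2006</td><td>45</td></tr>
  <tr><td>1002</td><td>34 Sample St</td><td>Ranch</td>
      <td>$255,000</td><td>$250,000</td><td>04/10/2006</td><td>30</td></tr>
</table>
"""

if __name__ == "__main__":
    print(render_table(extract_columns(SAMPLE)))
```

Looking columns up by header text rather than by position means a reordered column does not silently publish the wrong data, and a renamed or removed column fails loudly. It does not remove the maintenance burden Mike mentions when the source markup changes, and it does not cover the form submission or login step.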
|
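
The local 'cache' idea from the post above can be sketched along the following lines, again in Python for illustration only. The file path, the weekly interval (taken from the original poster's "weekly basis"), and the `build` callable are assumptions; `build` stands in for whatever slow extraction step produces the finished table.

```python
import os
import tempfile
import time
from pathlib import Path

WEEK_SECONDS = 7 * 24 * 60 * 60  # assumed refresh interval: once a week


def is_stale(path, max_age, now=None):
    """Return True if the cache file is missing or older than max_age seconds."""
    path = Path(path)
    if not path.exists():
        return True
    now = time.time() if now is None else now
    return now - path.stat().st_mtime > max_age


def refresh_cache(path, build, max_age=WEEK_SECONDS):
    """Return the cached text, calling build() only when the cache is stale.

    build is a hypothetical callable that performs the extraction and
    returns the finished HTML table as a string.
    """
    path = Path(path)
    if not is_stale(path, max_age):
        return path.read_text(encoding="utf-8")
    text = build()
    # Write to a temporary file in the same directory, then swap it in,
    # so a visitor never sees a half-written table.
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    os.replace(tmp, path)
    return text
```

A script like this could be run from a scheduler (cron, for example) so the website only ever serves the local file, which is the "set time" approach rather than extraction on demand.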