11-25-2014 03:16 AM - edited 11-25-2014 03:21 AM
Hello!
I have been trying to obtain table data from a web page without success. The data can be copied and pasted from the web page, but it can't be found in the web page source. I prefer NOT to use automated mouse clicks to obtain the data through CTRL-A, CTRL-C and then the Clipboard (I know how to do this, but it is not so elegant 🙂). I tried to get the data automatically with the following code: (See snippet)
I have also looked at the existing solutions in the forum for logging in to a website, but the difference here is that I am just interested in getting the data out of a table (form or element? I am not so familiar with HTML or Java).
Any tips are appreciated.
11-25-2014 03:53 AM
It depends. If the data is hidden from the web page source, this could mean one of the following:
1) If you can view the table when you visit the page (without being logged in) but it isn't in the page source, then the table is loaded by JavaScript/AJAX after the page loads. You'll probably have to get clever with some JavaScript to read the table after the page has loaded, but I can't immediately see a way to do this.
2) If you can't view the table when you visit the page because you're not logged in then LabVIEW won't be able to either. You would have to see if you can emulate logging in by POSTing your login information to the login page URL and see if it can log you in.
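For point 2, a rough sketch of what "POSTing your login information" could look like, written in Python only because it is compact to read. The URL and the form field names (`username`, `password`) are placeholders I made up; the real ones have to be read from the login page's form. The cookie-keeping opener is there so that a successful login would carry over to the request for the table page.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar

# Hypothetical URL: the real one is whatever the login form submits to.
LOGIN_URL = "https://example.com/login"


def build_login_request(username, password):
    """Build (but do not send) a POST request carrying the login form."""
    # Hypothetical field names: copy the real ones from the page's <form>.
    form = urllib.parse.urlencode({"username": username, "password": password})
    return urllib.request.Request(
        LOGIN_URL, data=form.encode("utf-8"), method="POST"
    )


def make_session():
    """Return an opener that stores cookies, so a login persists across requests."""
    jar = CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
```

To try it, something like `session = make_session()` followed by `session.open(build_login_request("me", "secret"))` would send the login, and the same `session` would then be used to open the page with the table. Whether the site accepts this depends on how its login is built (hidden tokens, redirects and so on).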
11-25-2014 05:21 AM
I'm not particularly familiar with HTML or Javascript myself, but I can say that you should probably use the .NET browser and not the ActiveX one, as shown in a quick example here - http://forums.ni.com/t5/LabVIEW/Mouse-event-in-embedded-internet-explorer/m-p/2853440#M832224
That example doesn't actually read the value of the elements, so you might need to do some more work there (I'm guessing maybe casting the element reference to its specific class and then getting the relevant properties). The advantage of using the .NET browser is that its data types are generally easier to work with and that it will probably be easier to search other sites to see what people have done with it.
11-25-2014 05:32 AM - edited 11-25-2014 05:35 AM
Hello Sam_Sharp
Point 2 is out of the question since Login is not required.
Point 1 is difficult for me since I am not familiar with Java.
Here is an example of a webpage: http://www.flashscore.com/basketball/
There you have a table called USA:MBA, whose values can't be observed in the page source.
If I use Google Chrome's Scraper tool and right-click the first row of the "USA:MBA" table, the Scraper tool window opens and displays the data for that row and also the following XPath:
//table[2]/tbody/tr[1]/td
The question is whether the XPath above can somehow be used with LabVIEW's IHTML property nodes to obtain the same values as Google Chrome's Scraper tool.
Tst: Thanks, I will have a look at the link you provided
11-25-2014 06:20 AM
The trick is to get the current HTML of the page and not the source. The source is the markup as the server sent it, before the page's JavaScript has run, and it is that JavaScript which loads and populates the table. It's the difference between inspecting an element in Chrome and viewing the source.
You'll need to see if there's a method/property that returns the 'live' html document object model after the Javascript has populated the table rather than the source of the page.
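Once you have the live HTML as text, an XPath like the one from the Scraper tool can be applied to it. Here is a minimal sketch in Python (not LabVIEW), just to show the idea. The sample markup and team names are invented stand-ins for the rendered page, and the standard-library parser used here only accepts well-formed markup, so a real page would likely need a more tolerant HTML parser first.

```python
import xml.etree.ElementTree as ET

# Invented stand-in for the rendered ("live") page, not real site data.
LIVE_HTML = """
<html><body>
  <table><tbody><tr><td>header</td></tr></tbody></table>
  <table><tbody>
    <tr><td>Team A</td><td>101</td></tr>
    <tr><td>Team B</td><td>99</td></tr>
  </tbody></table>
</body></html>
"""


def first_row_cells(html_text):
    """Return the cell texts of the first row of the second table."""
    root = ET.fromstring(html_text.strip())
    # Same path as //table[2]/tbody/tr[1]/td, written relative to the root
    # because ElementTree supports only a subset of XPath.
    cells = root.findall(".//table[2]/tbody/tr[1]/td")
    return [(td.text or "").strip() for td in cells]
```

With the sample above, `first_row_cells(LIVE_HTML)` gives the two cells of the first row of the second table.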
You may also have more success replicating the method the web page itself uses to load the data. It probably makes a request to a URL that returns the scores in a more 'raw' format, which would save you from parsing the HTML elements yourself.
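As a sketch of that second approach, again in Python for brevity: the endpoint URL and the JSON layout below are pure assumptions for illustration. The real request has to be found by watching the Network tab of the browser's developer tools while the page loads, and the real feed may well use a different format.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the request seen in the Network tab.
DATA_URL = "https://example.com/scores-feed"


def fetch_raw(url=DATA_URL):
    """Download the raw feed as text (performs a network request)."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8")


def parse_scores(raw):
    """Turn the raw feed into (home, home_score, away, away_score) tuples.

    Assumes a JSON list of objects with exactly these (made-up) keys.
    """
    return [
        (g["home"], g["home_score"], g["away"], g["away_score"])
        for g in json.loads(raw)
    ]
```

Usage would be along the lines of `parse_scores(fetch_raw())`. The parsing step can be tried offline by passing a hand-written JSON string to `parse_scores`.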
(Also, as an aside: Java is not the same as JavaScript. One is a compiled language and the other is a scripting language.)