Data Mining PDF documents; using data conversion to reduce analysis time
Problem A month ago, we became aware of a way to harvest legal notifications from a government web-site. Link Here The web-server allows simple requests to be crafted in order to download PDF documents related to court proceedings. After a few hours, we had over 25,000 PDF documents available to analyze. Now the question becomes: What is the [...]