Documents contain a lot of information. We'll introduce you to a variety of techniques to extract them.
Machine Learning techniques are useful for analyzing numeric data, but they can also be useful for classifying text, extracting content and more. We will discuss a variety of open source tools for extracting the content, identifying elements and structure and analyzing the text can be used in distributed, microservice-friendly ways.
Brian Sletten is a liberal arts-educated software engineer with a focus on forward-leaning technologies. His experience has spanned many industries including retail, banking, online games, defense, finance, hospitality and health care. He has a B.S. in Computer Science from the College of William and Mary and lives in Auburn, CA. He focuses on web architecture, resource-oriented computing, social networking, the Semantic Web, data science, 3D graphics, visualization, scalable systems, security consulting and other technologies of the late 20th and early 21st Centuries. He is also a rabid reader, devoted foodie and has excellent taste in music. If pressed, he might tell you about his International Pop Recording career.