Coursetree 2.0: An Intelligent Backend Coming Soon

Jun 24 2011

The goal of coursetree 2.0 is to leverage the current cloud infrastructure to deliver semantic applications that help users find the information they are looking for.

Features currently planned:

Course search that understands what the user wants
Filtering of irrelevant links
Pattern-based degree data mining

Draft implementation strategy:

Let Google search index Wikipedia and video links
A Bayesian classifier will be used to categorize link content into subjects
Template induction and template scraping
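
To make the classifier step more concrete, here is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written in plain Python. It is only an illustration under assumptions: the class name `NaiveBayes`, the `tokenize` helper, the subject labels and the training snippets are all made up for this example, and the real backend may look quite different.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # subject -> number of training texts
        self.word_counts = defaultdict(Counter)  # subject -> word -> count
        self.vocab = set()                       # every word seen in training

    def train(self, text, subject):
        self.doc_counts[subject] += 1
        for word in tokenize(text):
            self.word_counts[subject][word] += 1
            self.vocab.add(word)

    def classify(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_subject, best_score = None, float("-inf")
        for subject, n_docs in self.doc_counts.items():
            total_words = sum(self.word_counts[subject].values())
            # Work in log space: log prior plus the sum of log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in words:
                count = self.word_counts[subject][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_subject, best_score = subject, score
        return best_subject


# Hypothetical training data standing in for scraped link content.
clf = NaiveBayes()
clf.train("derivative integral limit calculus theorem", "math")
clf.train("matrix vector linear algebra eigenvalue", "math")
clf.train("cell protein dna organism evolution", "biology")
clf.train("enzyme membrane gene cell biology", "biology")

predicted = clf.classify("lecture on the integral and the derivative")
```

Words the classifier has never seen (such as "lecture" here) get the same smoothed probability in every subject, so only the known words decide the outcome.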

Features under consideration:

Adaptable prerequisite semantic analysis
Fully automated template learning and template extraction
Relevant course links/suggested courses

Tentative ideas:

Genetic algorithm for grammar rule generation with fitness score assigned according to the total number of parse errors
Use hashing algorithms to detect similar sections of a page, then feed the similar sections to wrapper induction to generate a template
Build map of courses using anti-requisites and display nearest neighbors
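
One possible reading of the hashing idea, sketched in Python: hash every run of consecutive tag names in a page section and compare sections by the overlap of their hash sets (Jaccard similarity). This is an assumption about how it could work, not a settled design. The function names, the shingle size `k=3`, the choice of SHA-256 and the sample HTML are all invented for illustration.

```python
import hashlib
import re


def shingle_hashes(html, k=3):
    """Hash every run of k consecutive opening-tag names in an HTML fragment."""
    tags = [t.lower() for t in re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html)]
    shingles = (" ".join(tags[i:i + k]) for i in range(len(tags) - k + 1))
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in shingles}


def similarity(a, b):
    """Jaccard similarity of the two fragments' shingle-hash sets."""
    ha, hb = shingle_hashes(a), shingle_hashes(b)
    if not ha and not hb:
        return 0.0
    return len(ha & hb) / len(ha | hb)


# Made-up sections: two course entries with the same layout, plus a nav footer.
course_a = "<div><h3>Course 101</h3><p>Designing programs</p><p>Prereq: none</p></div>"
course_b = "<div><h3>Course 102</h3><p>Algorithm design</p><p>Prereq: Course 101</p></div>"
footer = "<ul><li><a href='/'>Home</a></li><li><a href='/about'>About</a></li></ul>"

same_layout = similarity(course_a, course_b)
different_layout = similarity(course_a, footer)
```

Only the tag structure is hashed, so sections with the same layout but different text score as similar. Those are the sections that would be handed to wrapper induction.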

Unsuccessful incubation features:

Using genetic algorithms to generate templates for wrapper induction
Switching to parse trees to extract noun phrases as Wikipedia link candidates
YQL for video link scraping

Lessons learned and salvaged:

Don’t use genetic programming methods where a meaningful score cannot be assigned to each individual “program”: many of the templates simply failed with a score of zero
Extracting noun phrases from parse trees worked in some cases (for example, with a comma-separated list of noun phrases), but in other cases single words were marked as noun phrases instead of the more desirable longer phrase
Video sites change frequently, and the nested JavaScript callbacks with closures used to glue previews to links made the code a candidate for recycling
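
The first lesson can be illustrated with a small sketch. It assumes fitness-proportional (roulette-wheel) selection, which is a common choice but not necessarily what the experiment used, and the function name and scores are hypothetical. When every individual scores zero, selection has nothing to work with and degenerates into a uniform random draw.

```python
def selection_probabilities(scores):
    """Roulette-wheel selection: probability proportional to fitness."""
    total = sum(scores)
    if total == 0:
        # No individual stands out, so selection is a uniform random draw.
        return [1 / len(scores)] * len(scores)
    return [s / total for s in scores]


# Every template failed outright: the search gets no signal to climb.
all_failed = selection_probabilities([0, 0, 0, 0])

# Graded scores: better individuals are more likely to be picked as parents.
graded = selection_probabilities([0, 1, 3])
```

A fitness function that gives partial credit, even a crude one, keeps the search from turning into random guessing.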

Tags: genetic programming, javascript, pattern recognition, ply, python, wrapper induction
