Family tree || kuro5hin.org



Family tree

By Edmund Blackadder in Edmund Blackadder's Diary
Tue Oct 30, 2012 at 11:31:25 AM EST
Tags: neural networks, hinton, coursera, family relationships, family tree, toy example, logicagent, vector encodings, natural language representation, preprocessing (all tags)

In Lecture 4 of Coursera's Neural Networks for Machine Learning, Hinton presents an example of a family tree: http://subbot.org/logicagent/dialogs/familytree.png

He says he hand-coded a neural net to answer questions about the tree. The relevant paper, which is one of the seminal early works on back propagation, is at http://www.cs.toronto.edu/~hinton/absps/naturebp.pdf
It seems that the network doesn't accept natural-language input. The questions have to be submitted in the form of input vectors containing numbers. So names are encoded in a vector, and relationships such as "has-aunt", "has-father", etc. are encoded in another vector; these two vectors are presented as the inputs to the neural network. The output is a vector representation of a name or names that satisfies the relationship.
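To make the encoding concrete, here is a rough Ruby sketch of how a question could be turned into two input vectors and how an output vector could be turned back into names. This is not Hinton's code: it assumes a simple one-of-N ("one-hot") encoding, uses only a subset of the names and relationships mentioned in this diary, and PEOPLE, RELATIONS, one_hot, encode_question and decode are names I made up for illustration.

```ruby
# Illustrative subset only; the real example has more people and 12 relationships.
PEOPLE    = %w[andrew arthur charlotte christine colin
               james jennifer margaret penelope victoria].freeze
RELATIONS = %w[has-aunt has-father].freeze

# Build a vector with a 1 at the item's position and 0 everywhere else.
def one_hot(item, vocabulary)
  index = vocabulary.index(item)
  raise ArgumentError, "unknown item: #{item}" if index.nil?

  Array.new(vocabulary.size) { |i| i == index ? 1 : 0 }
end

# A question becomes a pair of vectors: [person vector, relationship vector].
def encode_question(person, relation)
  [one_hot(person, PEOPLE), one_hot(relation, RELATIONS)]
end

# Read names back out of an output vector: every unit above the threshold counts.
def decode(vector, vocabulary, threshold = 0.5)
  vocabulary.zip(vector).select { |_, value| value > threshold }.map(&:first)
end

person_vec, relation_vec = encode_question("colin", "has-aunt")
# person_vec   => [0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
# relation_vec => [1, 0]
```

An output vector with the jennifer and margaret units switched on would decode back to ["jennifer", "margaret"], which is the answer to "Who are Colin's aunts?" in the dialog further down.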
In my view, there is heavy preprocessing required to turn a simple question such as "Who are Colin's aunts?" into a form the network can accept. I would rather simply ask the program:
---
> who are Colin's aunts?
jennifer and margaret.
> who are Charlotte's grandmothers?
penelope and christine
> Who is Arthur's sister?
victoria
> Are Victoria and Arthur siblings?
Yes, Victoria and Arthur are siblings.
> Who is James's father?
andrew
> Who married Andrew?
Andrew married christine.
---
(See http://subbot.org/logicagent/dialogs/familytree5.txt ; http://subbot.org/logicagent/dialogs/familytree.yaml ; http://subbot.org/logicagent )
I have encoded several of the 12 relationships in Hinton's example as symbolic rules (expressed in Ruby): http://subbot.org/logicagent/dialogs/familytreerules.txt
I think I can derive the relationships not yet encoded from the existing rules. For example, I haven't (yet) included a rule that encodes the "has-son" relationship. But I can answer questions about the "has-son" relationship by asking a series of other questions:
---
sh-3.2# ruby logicbot.rb
Hello
I have loaded /Users/bob/logicagent/logicagent-api.yaml.
> logicbot: load rules from familytreerules.txt
Okay, familytreerules.txt read; 20 rules.
> logicbot: load graph from familytree.yaml
familytree.yaml loaded.
> Who is James's son?
I don't know who James's son is.
> Who did James marry?
victoria married James.
> Who did Victoria and James have?
James had colin, charlotte and charlotte.
> is Colin male?
Yes, Colin is male.
> Colin is James's son.
Okay, Colin is James's son.
> Who is James's son?
James's son is colin
>
---
To answer "who is James's son", the program needs to ask itself, "who did James marry", then "who did those two people have", then which of those children are male. (Note: the program doesn't understand "those two people" yet; I have an idea how to do that type of anaphora and hope to test it soon...)
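The chain of questions above could be sketched as a single derived rule. This is only a sketch under assumptions: FAMILY is a made-up in-memory hash (the real familytree.yaml may be laid out differently), children_of_couple and sons_of are hypothetical helper names and not part of logicagent, and Charlotte's sex is assumed to be female. The duplicated "charlotte" in the sample data mimics the repeated name in the transcript.

```ruby
# Hypothetical graph; only the facts from the dialog above, plus an assumed sex for Charlotte.
FAMILY = {
  "james"     => { sex: :male,   spouse: "victoria", children: %w[colin charlotte] },
  "victoria"  => { sex: :female, spouse: "james",    children: %w[colin charlotte charlotte] },
  "colin"     => { sex: :male },
  "charlotte" => { sex: :female } # assumption, not stated in the dialog
}.freeze

# Steps 1 and 2: find the spouse, then collect the children of both, without repeats.
def children_of_couple(graph, name)
  person = graph.fetch(name)
  spouse = graph[person[:spouse]] || {}
  (Array(person[:children]) + Array(spouse[:children])).uniq
end

# Step 3: keep only the children recorded as male.
def sons_of(graph, name)
  children_of_couple(graph, name).select { |child| graph.dig(child, :sex) == :male }
end

sons_of(FAMILY, "james") # => ["colin"]
```

Because the children are merged with uniq, a name that appears twice in the data is only reported once.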
The logicagent program in its current state has some bugs; for example, it repeats Charlotte's name.
It is tedious to encode the series of questions outlined above into a rule that will find someone's sons. The problems come in error-checking and in dealing with the different formats of responses. For example, one answer might be "colin and charlotte", while another way of answering the same question might list the names with commas separating them: "colin, charlotte". Making a rule that deals with these different possibilities is tiresome and time-consuming. (There is also the problem of iterating through responses and generalizing to cases where there are more than two children...)
However, Hinton too says that he had to hand-code his network. He trained it with 1500 passes over 100 of the 104 possible triples. I would like to see those triples, and the training times. How long did it take to code the network?
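One way to take some of the pain out of the response formats is to normalize every list of names before a rule looks at it. A minimal sketch, assuming the input is just the list part of an answer (for example "colin, charlotte and charlotte", without the leading "James had"); parse_names is a hypothetical helper, not something logicagent currently has.

```ruby
# Split a list of names written with commas, "and", or both into a clean array.
def parse_names(answer)
  answer.strip
        .sub(/[.?!]+\z/, "")                      # drop trailing punctuation
        .split(/\s*,\s*(?:and\s+)?|\s+and\s+/i)   # "a, b", "a and b", "a, b and c", "a, b, and c"
        .map { |name| name.strip.downcase }
        .reject(&:empty?)
        .uniq                                     # a repeated name is reported once
end

parse_names("colin and charlotte")            # => ["colin", "charlotte"]
parse_names("colin, charlotte")               # => ["colin", "charlotte"]
parse_names("colin, charlotte and charlotte") # => ["colin", "charlotte"]
```

The result is an ordinary array, so a rule can iterate over it whether there is one child or ten. Names that merely start with "and", such as andrew, are left alone because the pattern only splits on "and" surrounded by whitespace.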
I would like to get a working copy of Hinton's network, and test it against mine. I would like to include his program as an agent in my multi-agent system. My theory is that logicagent may handle certain inputs better, and other agents such as a neural network might handle other inputs better. The user can use feedback to teach the system to select the best agent response for different questions.
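The selection idea could be sketched roughly like this. It is a toy under assumptions, not how my multi-agent system is actually written: AgentSelector is a made-up class, each agent is just a callable that returns an answer string or nil, and the "kind" of question (for example :son or :aunt) is assumed to be supplied by the caller.

```ruby
# Picks, per kind of question, the agent whose answers have earned the most user feedback.
class AgentSelector
  def initialize(agents)
    @agents = agents        # { agent name => callable taking the question }
    @scores = Hash.new(0)   # { [agent name, question kind] => accumulated feedback }
  end

  # Ask every agent, drop the ones with no answer, return [name, reply] of the best-rated one.
  # Returns nil when no agent answers; on a tie the agent listed first wins.
  def answer(question, kind)
    replies = @agents.map { |name, agent| [name, agent.call(question)] }
                     .reject { |_, reply| reply.nil? }
    replies.max_by { |name, _| @scores[[name, kind]] }
  end

  # The user rewards (positive delta) or penalizes (negative delta) an agent for a kind of question.
  def feedback(name, kind, delta)
    @scores[[name, kind]] += delta
  end
end
```

With a rule-based agent and a network agent both registered, a call such as feedback("neuralnet", :son, 1) would make the network's reply win for later :son questions while leaving other kinds of question untouched.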


Family tree | 12 comments (12 topical, 0 editorial, 0 hidden)

I did some real genealogy a while back.. (none / 1) (#1)
by claes on Tue Oct 30, 2012 at 12:38:22 PM EST

There's a standard format called GEDCOM, and a bunch of tools handle it. You can even get downloads directly from the Mormons: https://familysearch.org/

Interesting thing is that sometimes there are typos, or you get the wrong person in the wrong place (i.e. confuse a sister and a daughter in a big household census record). There are a bunch of sanity checks you could do, or use soundex for matching names.

Interesting stuff.

Protip: (none / 1) (#2)
by Enlarged to Show Texture on Tue Oct 30, 2012 at 01:09:53 PM EST

See if it can handle the tree of a family in Appalachia

"Those people who think they know everything are a great annoyance to those of us who do." -- Isaac Asimov

>Who is Cletus's sister? by Harry B Otch, 10/30/2012 01:22:20 PM EST (3.00 / 3)

>So then, who is Darla's mother???? by Harry B Otch, 10/30/2012 02:08:44 PM EST (3.00 / 2)

the program needs to ask itself, (none / 0) (#4)
by tdillo on Tue Oct 30, 2012 at 02:04:43 PM EST

"who did James marry"

What about Jane and Victoria? What if Colin's daddy was a deadbeat crackhead that raped Victoria?

What if James found out through DNA testing that he was NOT the daddy and Leroy Johnson the Pool Boy was? What if Victoria had a sex change and became Victor then who is Colin's daddy? Or if Victoria was married to James but had been having an incestuous affair with her father?

Who's your daddy now?

I never knew someone actually invented the internet. I thought it was discovered and was always there like oxygen or something.
-Facebook User

Sounds like a bad Jane Austen novel by Harry B Otch, 10/30/2012 02:22:18 PM EST (none / 1)
Substitute 'fuck' for 'marry' if you like by Edmund Blackadder, 10/30/2012 05:41:18 PM EST (1.50 / 2)

Then you have in vitro fertilization by tdillo, 10/30/2012 07:04:11 PM EST (none / 0)

point: can it learn exceptions, as we do? by Edmund Blackadder, 10/30/2012 07:37:46 PM EST (1.50 / 2)

What's the point? (none / 1) (#7)
by ksandstr on Tue Oct 30, 2012 at 04:53:10 PM EST

It doesn't come with a front-end. You can just as easily hack up a cdecl-like syntax for this sort of thing.

What does this have to do with neural networks, anyway? IIRC one of the least good ways to study neural nets is by replacing a well-known function (in this case, traditionally implemented as search over a particular database) with a neural network that's been trained with that function and an input query generator. Baking the database into the function, and therefore the network.

At that point you learn about how much information that network can represent after being trained in that specific manner. Which turns out to be bloody low, considering that the network won't answer any other type of query.

Fin.

To test claims that rules are impractical for this by Edmund Blackadder, 10/30/2012 05:37:18 PM EST (1.50 / 2)

This is interesting (none / 1) (#12)
by procrasti on Wed Oct 31, 2012 at 05:23:15 PM EST

The purpose of the ancestor tree problem is just to see if a neural network can learn (and generalize) a couple of specific examples of a family tree... It's not a good system for parsing and analyzing family trees in general.

But the example has gotten you used to representing and thinking of things as feature vectors. This is basically the entire way of communicating with neural networks - and you might be familiar with bag-of-words models, which are also a feature vector representation.

In Programming Assignment 2, you build a word predictor that, in its training, creates a good general purpose representation of words. Perhaps you could use this in parsing your questions...

Maybe you could train a network on the sort of question/answer style 'dialogues' you currently have with your bot...

Question --> Word Representations --> Questions Representation --> (Trained Knowledge) --> Answer Representation --> Word Representation --> Answer

<-- kr5ddit
-------
if i ever see the nickname procrasti again on this site or anywhere in my life, i want it to be in an OBITUARY -- CTS
doing my best at licking arseholes - may 2015 -- mirko
-------
Winner of Kuro5hin: April 2015
-------
kr5ddit.com - You're front page to the internet^tm.

