Cycles in family tree software

276,084

Solution 1

It seems you (and/or your company) have a fundamental misunderstanding of what a family tree is supposed to be.

Let me clarify, I also work for a company that has (as one of its products) a family tree in its portfolio, and we have been struggling with similar problems.

The problem, in our case, and I assume your case as well, comes from the GEDCOM format that is extremely opinionated about what a family should be. However this format contains some severe misconceptions about what a family tree really looks like.

GEDCOM has many issues, such as incompatibility with same sex relations, incest, etc... Which in real life happens more often than you'd imagine (especially when going back in time to the 1700-1800).

We have modeled our family tree to what happens in the real world: Events (for example, births, weddings, engagement, unions, deaths, adoptions, etc.). We do not put any restrictions on these, except for logically impossible ones (for example, one can't be one's own parent, relations need two individuals, etc...)

The lack of validations gives us a more "real world", simpler and more flexible solution.

As for this specific case, I would suggest removing the assertions as they do not hold universally.

For displaying issues (that will arise) I would suggest drawing the same node as many times as needed, hinting at the duplication by lighting up all the copies on selecting one of them.

Solution 2

Relax your assertions.

Not by changing the rules, which are mostly likely very helpful to 99.9% of your customers in catching mistakes in entering their data.

Instead, change it from an error "can't add relationship" to a warning with an "add anyway".

Solution 3

Here's the problem with family trees: they are not trees. They are directed acyclic graphs or DAGs. If I understand the principles of the biology of human reproduction correctly, there will not be any cycles.

As far as I know, even the Christians accept marriages (and thus children) between cousins, which will turn the family tree into a family DAG.

The moral of the story is: choose the right data structures.

Solution 4

I guess that you have some value that uniquely identifies a person on which you can base your checks.

This is a tricky one. Assuming you want to keep the structure a tree, I suggest this:

Assume this: A has kids with his own daughter.

A adds himself to the program as A and as B. Once in the role of father, let's call it boyfriend.

Add a is_same_for_out() function which tells the output generating part of your program that all links going to B internally should be going to A on presentation of data.

This will make some extra work for the user, but I guess IT would be relatively easy to implement and maintain.

Building from that, you could work on code synching A and B to avoid inconsistencies.

This solution is surely not perfect, but is a first approach.

Solution 5

You should focus on what really makes value for your software. Is the time spent on making it work for ONE consumer worth the price of the license ? Likely not.

I advise you to apologize to this customer, tell him that his situation is out of scope for your software and issue him a refund.

Share:
276,084
Partick Höse
Author by

Partick Höse

Updated on August 14, 2020

Comments

  • Partick Höse
    Partick Höse over 3 years

    I am the developer of some family tree software (written in C++ and Qt). I had no problems until one of my customers mailed me a bug report. The problem is that the customer has two children with their own daughter, and, as a result, he can't use my software because of errors.

    Those errors are the result of my various assertions and invariants about the family graph being processed (for example, after walking a cycle, the program states that X can't be both father and grandfather of Y).

    How can I resolve those errors without removing all data assertions?

    • kidsid49
      kidsid49 almost 13 years
    • Thomas
      Thomas almost 13 years
      If you trace your family tree backwards far enough, you will hit this problem far more often than you would like. Abandoning the tree representation may be painful but would ultimately be more correct.
    • pgod
      pgod almost 13 years
      You shouldn't add assertions for unlikely things, only impossible things. Cycles are the obvious things that aren't possible in a family tree graph... no one can be his own ancestor via any method. These other assertions are just bogus and should be removed.
    • thomasrutter
      thomasrutter almost 13 years
      @pgod instead of failing on these assertions, I think it's helpful to report unlikely things to the user to let them know they may have made a mistake but provide an easy and non-patronising method to proceed anyway.
    • kaleissin
      kaleissin almost 13 years
      This is not at all a silly question in the world of pet breeding. Daughter to father, mother to son, sister to brother, grandchildren to grandparents is standard technique there, and pet breeders need family tree software too. "Pure-bred" my ¤%#&.
    • johnsyweb
      johnsyweb almost 13 years
      This question is a great illustration about assumptions that software engineers make when producing code. Assertions have no place in production code for situations that could happen.
    • Xeo
      Xeo almost 13 years
      Are you guys seriously trying to close this as subjective and argumentative? The question is totally clear to me, it's asking for a solution to cycles in a tree / graph.
    • Rebecca Chernoff
      Rebecca Chernoff almost 13 years
      Do you get the same cycle error if cousins married, for instance a child of cousins that married would also be a child of the parent but also a cousin of the parent. This is a common scenario.
    • rtperson
      rtperson almost 13 years
      Marrying first cousins was very common in Victorian England, especially among the upper classes (it was an excellent way to keep money within the family). Charles Darwin, for instance, married his first cousin, Emma Wedgwood. Any family tree software needs to support situations like this.
    • bgw
      bgw almost 13 years
      FYI, got linked here from Reddit
    • Noldorin
      Noldorin over 12 years
      The thing is, it's not technically a tree any more, in the mathematical sense. As soon as you introduce (incestuous) cycles, it becomes a fully-fledged graph. So tell the customer to go buy a "family graph" program tailored for incestuous hicks. :-)
    • onosendai
      onosendai over 11 years
      One of those rare cases where the crash report should go to the police not the developer.
    • Mooing Duck
      Mooing Duck about 11 years
      @pgod: If a person marries the mother of a man who marries his mother, then he's his own grandpa. Being your own ancestor genetically is impossible, but socially it's not. And most family tree software tracks both.
    • minopret
      minopret over 10 years
      On 2013-02-07 I made a suggestion. Maybe it will yet be helpful to someone. If necessary, one may compromise with the difficult constraints of GEDCOM by entering a "person" that incorporates a group of people who have unsupported relationships. To illustrate the point, I called my example "Daddy-Daughter Inc." I openly admit that this is unpleasant toward people and fails to capture some facts in machine-readable form. Technically, does it not patch over a local problem in the GEDCOM representation of a family? Oh well. It was downvoted to -3 without comment, so I deleted it from the answers.
    • Dave Cousineau
      Dave Cousineau over 9 years
      This is why it's good to challenge assertions strongly when gathering requirements. "X doesn't happen". Ok, but does that mean that "X doesn't normally happen", that "X will "never" happen", or that "X cannot possibly happen"?
    • chepner
      chepner over 9 years
      No family tree (outside of time-travel fiction) has cycles, but the problem here is that your program doesn't support directed acyclic graphs, where one node can serve as more than one type of ancestor to another. Such graphs may be undesirable either morally, socially, or legally, but they are not biologically impossible.
  • Partick Höse
    Partick Höse almost 13 years
    Probably such "proxy" nodes are indeed suitable solution. However I have no idea how can those be put in user interface without offending user. I can tell you that writing software that deals with real people (especially your customers) is not easy.
  • Bo Persson
    Bo Persson almost 13 years
    It never ends - B's new son will be his own uncle. I would consider a full refund for the program!
  • Null Set
    Null Set almost 13 years
    @Will A: And then realizes he is also his own mother, and recruits his younger self into the time agency?
  • thomasrutter
    thomasrutter almost 13 years
    When encountering a very unlikely situation, that is, one where a user would usually only do it by mistake, it is a good idea to show the user a warning. That's good feedback. But then let the user go ahead if they are really sure they want to. So I think this is a good answer, even if it doesn't get into the nuts and bolts of how.
  • phooji
    phooji almost 13 years
    Agree with 'when to use assertions' argument; don't see how it relates to 'some languages have asserts, Go doesn't.'
  • Zaur Nasibov
    Zaur Nasibov almost 13 years
    Good answer! I just wonder, how this kind of software will handle "I am my own grandpa" (youtube.com/watch?v=eYlJH81dSiw) situation?
  • Arlen
    Arlen almost 13 years
    @Tim Post If you know it's impossible, then why bother asserting it?
  • Tim Post
    Tim Post almost 13 years
    @Red Hue - sometimes compilers make the impossible ... possible. Some versions of gcc think -10 == 10 in the abs() implementation.
  • cHao
    cHao almost 13 years
    @Red Hue: The whole point of assertions is to document and test conditions that should always be true (or false). It helps keep you (and others) from "fixing" things in such a way that those impossible cases arise, as then they'd explicitly (rather than subtly) break the app. If there's a valid reason for an "impossible" case to appear, then you've asserted too much.
  • Arlen
    Arlen almost 13 years
    @cHao @Tim Post I'm just trying to understand why Go not having assertions is a good thing since most of you agree that assertion is important to have.
  • Paul Harrison
    Paul Harrison almost 13 years
    This looks like the right approach, and it's easy enough to extend to detect more complex problems. You can work out a set of "A happened before B" relationships between events. For example, that a person was born before any other events involving them. This is a directed graph. You could then check that the graph contains no cycles. See this question on StackOverflow. This should be ok until time travel is invented.
  • cHao
    cHao almost 13 years
    @Red Hue: I imagine that point of view is based on cases like the original question, where assertions are being a bit abused. Asserting too much (or too little) is more common than many people will realize, because people rarely learn to use assertions properly. It's also a dev-time thing, which often won't be tested in production anyway, so some would conclude that the best way to handle it is to get rid of assertions altogether. I disagree, but have some sympathy for the reasoning behind it.
  • Prof. Falken
    Prof. Falken almost 13 years
    Very true. But also weigh other potential problems with similar troubles others have brought up.
  • Bert Goethals
    Bert Goethals almost 13 years
    @paul-harrison If it where only that simple. In older records (even new ones) there are date inconsistencies. Baptism before birth, multiple birth records etc... So to an extent, in official records, there is time travel. We allow this inconsistent data. We allow users to indicate what the application should consider "the" birth record in case of duplicates. And we'll indicate broken timelines if found.
  • Bert Goethals
    Bert Goethals almost 13 years
    Duplication (and syncing) of data within one system is bad practice. It indicates that the solution is sub optimal and should be reconsidered. If creating extra (duplicate) nodes would be needed, indicate it as a proxy and delegate the data reads and writes to the original node.
  • manixrock
    manixrock almost 13 years
    It would need a further restriction of every node having 1 or 2 maximum nodes pointing to it for in vitro and sexual reproduction. Although to be more true to real life, you might allow multiple dashed lines for uncertain descendancy on the father side (it's always clear who the mother is, but only DNA testing can insure who the father is, and that's rarely done even today), or even for both is adoption is taken into account.
  • christopheml
    christopheml almost 13 years
    Of course. The reasoning is : if it's a rare edge case on a non-critical application, you are not require to fix or implement anything. If it's really hurting your users, there's value in working on it.
  • bdwakefield
    bdwakefield almost 13 years
    This is not really an answer, because I think the problem comes from actually traversing the tree? However, it is a good suggestion.
  • Prof. Falken
    Prof. Falken almost 13 years
    If a language has assertions or not built in will hardly much impact on if developers will use an assertion pattern, or not, in their code. Many I am sure even do it without knowing that it's called an assertion...
  • Ben Voigt
    Ben Voigt almost 13 years
    @bdwakefield: The question was "How do I resolve these errors, without removing all data assertions?" I believe I've answered that.
  • Peter Recore
    Peter Recore almost 13 years
    @manixrock - since this question is about rare cases, i would like to assert that is not always clear who the mother is. adoptions, abandoned babies, surrogate moms, etc can all complicate matters.
  • rm999
    rm999 almost 13 years
    @Ben It depends on what the assertions are for. If they prevent infinite loops or fatal errors from happening, then you are effectively suggesting to remove the assertions. If they are just there to warn a user of a potential mistake, then your answer is a good one.
  • Ben Voigt
    Ben Voigt almost 13 years
    @rm999: The question is pretty clear that the assertions are enforced after traversing the data structure.
  • rm999
    rm999 almost 13 years
    @Ben I think we are reading the question differently. I interpret "cycle" to mean a cycle in the graph (which would mean during traversal). This makes sense in the context of the question: OP assumes a tree structure, and a child having the same father and grandfather is a loop and hence not a tree. Many tree traversal algorithms would fail on a graph with loops, so if my reading of the question is correct bdwakefield has an entirely legitimate point.
  • Tommy McGuire
    Tommy McGuire almost 13 years
    Having assertions (or assertion-like code) is irrelevant. Code in languages like Go can and will make assumptions about the structure of data; it just can't document and enforce those assumptions with assertions. Bottom line: the application has a bug.
  • Steve Kalemkiewicz
    Steve Kalemkiewicz almost 13 years
    Your comment made me think of polygamy. Genealogy software that only models sexual reproduction may require a name attached to the sperm and the egg but broader definitions of family structure do not.
  • Bert Goethals
    Bert Goethals almost 13 years
    @ben-voigt GEDCOM is a format created by the The Church of Jesus Christ of Latter-day Saints. The specification clearly states that marriage (MARR) is to be between men and women. For same sex marriage or incest the ASSO tag should be used (ASSOCIATES), also used to indicate friendship or being neighbours. It is clear the same sex marriage is second class relation within this spec. A more neutral spec would not demand male female relations.
  • Ed Ropple
    Ed Ropple almost 13 years
    It's not necessarily acyclic, is it? Man-marries-grandmother.
  • exDM69
    exDM69 almost 13 years
    Man marrying his on grandmother will not make himself his own grandfather and adding a cycle. If they have children, it will be a non-cycling regular graph edge.
  • datenwolf
    datenwolf almost 13 years
    Probably everybody has some case of incest somewhere in his/her ancestry. So you'll hit that bump if one digs family history (too) deep.
  • JSacksteder
    JSacksteder over 12 years
    It's actually TWO ADGs. There is the parentage graph and the legal relationship graph. Usually the same, but divergent more than one might expect.
  • musiphil
    musiphil over 12 years
    The ln -s command doesn't work that way; the resolution of the link Family/Son/Father will look for Family/Son/Daughter/Father from under Family/Son, where the link resides, not from . where you issued the ln -s command.
  • gvd
    gvd about 12 years
    This question has nothing to do with language features.
  • Agrajag
    Agrajag about 12 years
    Generally speaking there -will- be cycles. Just not cycles of the "parent of" type of relationship. There can very well be cycles in relationships generally, such as "married to", especially when you consider that these relationships change over time.
  • Agrajag
    Agrajag about 12 years
    Careful about those assumptions; one male and one female parent isn't a given when people adapt, or lesibans who consider themselves as parents, in the near future they may even be able to really be biologically the parents, atleast of girls. For that matter, if we apply dolly to humans, even the assumption "a person has two distinct parents" is out.
  • Todd Hopkinson
    Todd Hopkinson about 12 years
    Genealogy software will often allow more than one spouse in the model. How you display the model in the view varies widely, even within one program, depending on the "mode" that has been provided.
  • tfinniga
    tfinniga about 12 years
    @Agrajag, yes that's why I specified "biologically speaking" for the cycle detection. Even biologically, there are lots of possible issues, like surrogate mothers and artificial insemination. If you also allow adoptions and other non-biological methods for defining parents, then it's possible to have a valid true cycle in a tree - for example, maybe someone adopts their grandparent when they get old and are no longer able to take care of themselves. Making assumptions about people's family life is always complicated. But when writing software you need to make some assumptions..
  • Agrajag
    Agrajag about 12 years
    Assuming perfect knowledge and biology, there won't be any cycles in biological ancestry, but in the real world genealogy is full of "perhaps"-links. manxrock above, for example, asserts that it's always clear who the mother is. Which is not true in the real world. Yes, today DNA-testing can answer that question, but how are you going to get your relative from 5 generations back DNA-tested ?
  • MikeIsrael
    MikeIsrael over 11 years
    cloning is prohibited by the geneva conventions
  • sjas
    sjas over 11 years
    Sadly, way too many people first think of 'ok' data instead of the edge cases that break their systems.
  • Bulwersator
    Bulwersator almost 11 years
    Making genealogy tree of some weird situation (inbreed royalty, Fritzl etc) is valid use of software.
  • Pierre
    Pierre over 9 years
    @Bert Goethals: You are confusing GEDCOM with certain programs that do not support same-sex marriage (PAF, Legacy). GEDCOM does not preclude constructs such as "0 @F1@ FAM/1 HUSB @I1@/1 HUSB @I2@", and thus supports same-sex marriages if your software chooses to.
  • Bert Goethals
    Bert Goethals over 9 years
    @Pierre You can cheat the system indeed. This is directly from the 5.5.1 docs: "MARR {MARRIAGE}: = A legal, common-law, or customary event of creating a family unit of a man and a woman as husband and wife." (homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gcappa.h‌​tm) As you can see, no same sex marriage here.
  • Pierre
    Pierre over 9 years
    @Bert Goethals: Here is a test which you can easily perform yourself: [1] Create a same-sex marriage in a Family Tree Maker file; [2] export to GEDCOM; [3] Import the GEDCOM file into RootsMagic. The same-sex marriage is preserved (vg. both partners are still male). How did that happen? Don't confuse what the spec. recommends, versus what is actually possible.
  • Mahdi
    Mahdi over 9 years
    A family tree software that won't allow second cousins to marry is useless. Nearly all families has atleast one case of this. Which is why I think the original example is made up for effect.
  • Bert Goethals
    Bert Goethals almost 9 years
    @TylerH I did mean at least. Also any "relation" could be expressed as a multiple of pairs. Eg: a threesome can be expressed as 2 relations per individual.
  • Bert Goethals
    Bert Goethals almost 9 years
    @Pierre My post is about the Spec; not about what other applications try to achieve by breaking the spec. A spec "specifies" not recommends. The fact that implementers are breaking the spec indicates that the spec is indeed flawed.