Cycles in family tree software
Solution 1
It seems you (and/or your company) have a fundamental misunderstanding of what a family tree is supposed to be.
Let me clarify, I also work for a company that has (as one of its products) a family tree in its portfolio, and we have been struggling with similar problems.
The problem, in our case, and I assume your case as well, comes from the GEDCOM format that is extremely opinionated about what a family should be. However this format contains some severe misconceptions about what a family tree really looks like.
GEDCOM has many issues, such as incompatibility with same sex relations, incest, etc... Which in real life happens more often than you'd imagine (especially when going back in time to the 1700-1800).
We have modeled our family tree to what happens in the real world: Events (for example, births, weddings, engagement, unions, deaths, adoptions, etc.). We do not put any restrictions on these, except for logically impossible ones (for example, one can't be one's own parent, relations need two individuals, etc...)
The lack of validations gives us a more "real world", simpler and more flexible solution.
As for this specific case, I would suggest removing the assertions as they do not hold universally.
For displaying issues (that will arise) I would suggest drawing the same node as many times as needed, hinting at the duplication by lighting up all the copies on selecting one of them.
Solution 2
Relax your assertions.
Not by changing the rules, which are mostly likely very helpful to 99.9% of your customers in catching mistakes in entering their data.
Instead, change it from an error "can't add relationship" to a warning with an "add anyway".
Solution 3
Here's the problem with family trees: they are not trees. They are directed acyclic graphs or DAGs. If I understand the principles of the biology of human reproduction correctly, there will not be any cycles.
As far as I know, even the Christians accept marriages (and thus children) between cousins, which will turn the family tree into a family DAG.
The moral of the story is: choose the right data structures.
Solution 4
I guess that you have some value that uniquely identifies a person on which you can base your checks.
This is a tricky one. Assuming you want to keep the structure a tree, I suggest this:
Assume this: A
has kids with his own daughter.
A
adds himself to the program as A
and as B
. Once in the role of father, let's call it boyfriend.
Add a is_same_for_out()
function which tells the output generating part of your program that all links going to B
internally should be going to A
on presentation of data.
This will make some extra work for the user, but I guess IT would be relatively easy to implement and maintain.
Building from that, you could work on code synching A
and B
to avoid inconsistencies.
This solution is surely not perfect, but is a first approach.
Solution 5
You should focus on what really makes value for your software. Is the time spent on making it work for ONE consumer worth the price of the license ? Likely not.
I advise you to apologize to this customer, tell him that his situation is out of scope for your software and issue him a refund.
Partick Höse
Updated on August 14, 2020Comments
-
Partick Höse over 3 years
I am the developer of some family tree software (written in C++ and Qt). I had no problems until one of my customers mailed me a bug report. The problem is that the customer has two children with their own daughter, and, as a result, he can't use my software because of errors.
Those errors are the result of my various assertions and invariants about the family graph being processed (for example, after walking a cycle, the program states that X can't be both father and grandfather of Y).
How can I resolve those errors without removing all data assertions?
-
kidsid49 almost 13 years
-
Thomas almost 13 yearsIf you trace your family tree backwards far enough, you will hit this problem far more often than you would like. Abandoning the tree representation may be painful but would ultimately be more correct.
-
pgod almost 13 yearsYou shouldn't add assertions for unlikely things, only impossible things. Cycles are the obvious things that aren't possible in a family tree graph... no one can be his own ancestor via any method. These other assertions are just bogus and should be removed.
-
thomasrutter almost 13 years@pgod instead of failing on these assertions, I think it's helpful to report unlikely things to the user to let them know they may have made a mistake but provide an easy and non-patronising method to proceed anyway.
-
kaleissin almost 13 yearsThis is not at all a silly question in the world of pet breeding. Daughter to father, mother to son, sister to brother, grandchildren to grandparents is standard technique there, and pet breeders need family tree software too. "Pure-bred" my ¤%#&.
-
johnsyweb almost 13 yearsThis question is a great illustration about assumptions that software engineers make when producing code. Assertions have no place in production code for situations that could happen.
-
Xeo almost 13 yearsAre you guys seriously trying to close this as subjective and argumentative? The question is totally clear to me, it's asking for a solution to cycles in a tree / graph.
-
Rebecca Chernoff almost 13 yearsDo you get the same cycle error if cousins married, for instance a child of cousins that married would also be a child of the parent but also a cousin of the parent. This is a common scenario.
-
rtperson almost 13 yearsMarrying first cousins was very common in Victorian England, especially among the upper classes (it was an excellent way to keep money within the family). Charles Darwin, for instance, married his first cousin, Emma Wedgwood. Any family tree software needs to support situations like this.
-
bgw almost 13 yearsFYI, got linked here from Reddit
-
Noldorin over 12 yearsThe thing is, it's not technically a tree any more, in the mathematical sense. As soon as you introduce (incestuous) cycles, it becomes a fully-fledged graph. So tell the customer to go buy a "family graph" program tailored for incestuous hicks. :-)
-
onosendai over 11 yearsOne of those rare cases where the crash report should go to the police not the developer.
-
Mooing Duck about 11 years@pgod: If a person marries the mother of a man who marries his mother, then he's his own grandpa. Being your own ancestor genetically is impossible, but socially it's not. And most family tree software tracks both.
-
minopret over 10 yearsOn 2013-02-07 I made a suggestion. Maybe it will yet be helpful to someone. If necessary, one may compromise with the difficult constraints of GEDCOM by entering a "person" that incorporates a group of people who have unsupported relationships. To illustrate the point, I called my example "Daddy-Daughter Inc." I openly admit that this is unpleasant toward people and fails to capture some facts in machine-readable form. Technically, does it not patch over a local problem in the GEDCOM representation of a family? Oh well. It was downvoted to -3 without comment, so I deleted it from the answers.
-
Dave Cousineau over 9 yearsThis is why it's good to challenge assertions strongly when gathering requirements. "X doesn't happen". Ok, but does that mean that "X doesn't normally happen", that "X will "never" happen", or that "X cannot possibly happen"?
-
chepner over 9 yearsNo family tree (outside of time-travel fiction) has cycles, but the problem here is that your program doesn't support directed acyclic graphs, where one node can serve as more than one type of ancestor to another. Such graphs may be undesirable either morally, socially, or legally, but they are not biologically impossible.
-
-
Partick Höse almost 13 yearsProbably such "proxy" nodes are indeed suitable solution. However I have no idea how can those be put in user interface without offending user. I can tell you that writing software that deals with real people (especially your customers) is not easy.
-
Bo Persson almost 13 yearsIt never ends - B's new son will be his own uncle. I would consider a full refund for the program!
-
Null Set almost 13 years@Will A: And then realizes he is also his own mother, and recruits his younger self into the time agency?
-
thomasrutter almost 13 yearsWhen encountering a very unlikely situation, that is, one where a user would usually only do it by mistake, it is a good idea to show the user a warning. That's good feedback. But then let the user go ahead if they are really sure they want to. So I think this is a good answer, even if it doesn't get into the nuts and bolts of how.
-
phooji almost 13 yearsAgree with 'when to use assertions' argument; don't see how it relates to 'some languages have asserts, Go doesn't.'
-
Zaur Nasibov almost 13 yearsGood answer! I just wonder, how this kind of software will handle "I am my own grandpa" (youtube.com/watch?v=eYlJH81dSiw) situation?
-
Arlen almost 13 years@Tim Post If you know it's impossible, then why bother asserting it?
-
Tim Post almost 13 years@Red Hue - sometimes compilers make the impossible ... possible. Some versions of gcc think -10 == 10 in the abs() implementation.
-
cHao almost 13 years@Red Hue: The whole point of assertions is to document and test conditions that should always be true (or false). It helps keep you (and others) from "fixing" things in such a way that those impossible cases arise, as then they'd explicitly (rather than subtly) break the app. If there's a valid reason for an "impossible" case to appear, then you've asserted too much.
-
Arlen almost 13 years@cHao @Tim Post I'm just trying to understand why Go not having assertions is a good thing since most of you agree that assertion is important to have.
-
Paul Harrison almost 13 yearsThis looks like the right approach, and it's easy enough to extend to detect more complex problems. You can work out a set of "A happened before B" relationships between events. For example, that a person was born before any other events involving them. This is a directed graph. You could then check that the graph contains no cycles. See this question on StackOverflow. This should be ok until time travel is invented.
-
cHao almost 13 years@Red Hue: I imagine that point of view is based on cases like the original question, where assertions are being a bit abused. Asserting too much (or too little) is more common than many people will realize, because people rarely learn to use assertions properly. It's also a dev-time thing, which often won't be tested in production anyway, so some would conclude that the best way to handle it is to get rid of assertions altogether. I disagree, but have some sympathy for the reasoning behind it.
-
Prof. Falken almost 13 yearsVery true. But also weigh other potential problems with similar troubles others have brought up.
-
Bert Goethals almost 13 years@paul-harrison If it where only that simple. In older records (even new ones) there are date inconsistencies. Baptism before birth, multiple birth records etc... So to an extent, in official records, there is time travel. We allow this inconsistent data. We allow users to indicate what the application should consider "the" birth record in case of duplicates. And we'll indicate broken timelines if found.
-
Bert Goethals almost 13 yearsDuplication (and syncing) of data within one system is bad practice. It indicates that the solution is sub optimal and should be reconsidered. If creating extra (duplicate) nodes would be needed, indicate it as a proxy and delegate the data reads and writes to the original node.
-
manixrock almost 13 yearsIt would need a further restriction of every node having 1 or 2 maximum nodes pointing to it for in vitro and sexual reproduction. Although to be more true to real life, you might allow multiple dashed lines for uncertain descendancy on the father side (it's always clear who the mother is, but only DNA testing can insure who the father is, and that's rarely done even today), or even for both is adoption is taken into account.
-
christopheml almost 13 yearsOf course. The reasoning is : if it's a rare edge case on a non-critical application, you are not require to fix or implement anything. If it's really hurting your users, there's value in working on it.
-
bdwakefield almost 13 yearsThis is not really an answer, because I think the problem comes from actually traversing the tree? However, it is a good suggestion.
-
Prof. Falken almost 13 yearsIf a language has assertions or not built in will hardly much impact on if developers will use an assertion pattern, or not, in their code. Many I am sure even do it without knowing that it's called an assertion...
-
Ben Voigt almost 13 years@bdwakefield: The question was "How do I resolve these errors, without removing all data assertions?" I believe I've answered that.
-
Peter Recore almost 13 years@manixrock - since this question is about rare cases, i would like to assert that is not always clear who the mother is. adoptions, abandoned babies, surrogate moms, etc can all complicate matters.
-
rm999 almost 13 years@Ben It depends on what the assertions are for. If they prevent infinite loops or fatal errors from happening, then you are effectively suggesting to remove the assertions. If they are just there to warn a user of a potential mistake, then your answer is a good one.
-
Ben Voigt almost 13 years@rm999: The question is pretty clear that the assertions are enforced after traversing the data structure.
-
rm999 almost 13 years@Ben I think we are reading the question differently. I interpret "cycle" to mean a cycle in the graph (which would mean during traversal). This makes sense in the context of the question: OP assumes a tree structure, and a child having the same father and grandfather is a loop and hence not a tree. Many tree traversal algorithms would fail on a graph with loops, so if my reading of the question is correct bdwakefield has an entirely legitimate point.
-
Tommy McGuire almost 13 yearsHaving assertions (or assertion-like code) is irrelevant. Code in languages like Go can and will make assumptions about the structure of data; it just can't document and enforce those assumptions with assertions. Bottom line: the application has a bug.
-
Steve Kalemkiewicz almost 13 yearsYour comment made me think of polygamy. Genealogy software that only models sexual reproduction may require a name attached to the sperm and the egg but broader definitions of family structure do not.
-
Bert Goethals almost 13 years@ben-voigt GEDCOM is a format created by the The Church of Jesus Christ of Latter-day Saints. The specification clearly states that marriage (MARR) is to be between men and women. For same sex marriage or incest the ASSO tag should be used (ASSOCIATES), also used to indicate friendship or being neighbours. It is clear the same sex marriage is second class relation within this spec. A more neutral spec would not demand male female relations.
-
Ed Ropple almost 13 yearsIt's not necessarily acyclic, is it? Man-marries-grandmother.
-
exDM69 almost 13 yearsMan marrying his on grandmother will not make himself his own grandfather and adding a cycle. If they have children, it will be a non-cycling regular graph edge.
-
datenwolf almost 13 yearsProbably everybody has some case of incest somewhere in his/her ancestry. So you'll hit that bump if one digs family history (too) deep.
-
JSacksteder over 12 yearsIt's actually TWO ADGs. There is the parentage graph and the legal relationship graph. Usually the same, but divergent more than one might expect.
-
musiphil over 12 yearsThe
ln -s
command doesn't work that way; the resolution of the linkFamily/Son/Father
will look forFamily/Son/Daughter/Father
from underFamily/Son
, where the link resides, not from.
where you issued theln -s
command. -
gvd about 12 yearsThis question has nothing to do with language features.
-
Agrajag about 12 yearsGenerally speaking there -will- be cycles. Just not cycles of the "parent of" type of relationship. There can very well be cycles in relationships generally, such as "married to", especially when you consider that these relationships change over time.
-
Agrajag about 12 yearsCareful about those assumptions; one male and one female parent isn't a given when people adapt, or lesibans who consider themselves as parents, in the near future they may even be able to really be biologically the parents, atleast of girls. For that matter, if we apply dolly to humans, even the assumption "a person has two distinct parents" is out.
-
Todd Hopkinson about 12 yearsGenealogy software will often allow more than one spouse in the model. How you display the model in the view varies widely, even within one program, depending on the "mode" that has been provided.
-
tfinniga about 12 years@Agrajag, yes that's why I specified "biologically speaking" for the cycle detection. Even biologically, there are lots of possible issues, like surrogate mothers and artificial insemination. If you also allow adoptions and other non-biological methods for defining parents, then it's possible to have a valid true cycle in a tree - for example, maybe someone adopts their grandparent when they get old and are no longer able to take care of themselves. Making assumptions about people's family life is always complicated. But when writing software you need to make some assumptions..
-
Agrajag about 12 yearsAssuming perfect knowledge and biology, there won't be any cycles in biological ancestry, but in the real world genealogy is full of "perhaps"-links. manxrock above, for example, asserts that it's always clear who the mother is. Which is not true in the real world. Yes, today DNA-testing can answer that question, but how are you going to get your relative from 5 generations back DNA-tested ?
-
MikeIsrael over 11 yearscloning is prohibited by the geneva conventions
-
sjas over 11 yearsSadly, way too many people first think of 'ok' data instead of the edge cases that break their systems.
-
Bulwersator almost 11 yearsMaking genealogy tree of some weird situation (inbreed royalty, Fritzl etc) is valid use of software.
-
Pierre over 9 years@Bert Goethals: You are confusing GEDCOM with certain programs that do not support same-sex marriage (PAF, Legacy). GEDCOM does not preclude constructs such as "0 @F1@ FAM/1 HUSB @I1@/1 HUSB @I2@", and thus supports same-sex marriages if your software chooses to.
-
Bert Goethals over 9 years@Pierre You can cheat the system indeed. This is directly from the 5.5.1 docs: "MARR {MARRIAGE}: = A legal, common-law, or customary event of creating a family unit of a man and a woman as husband and wife." (homepages.rootsweb.ancestry.com/~pmcbride/gedcom/55gcappa.htm) As you can see, no same sex marriage here.
-
Pierre over 9 years@Bert Goethals: Here is a test which you can easily perform yourself: [1] Create a same-sex marriage in a Family Tree Maker file; [2] export to GEDCOM; [3] Import the GEDCOM file into RootsMagic. The same-sex marriage is preserved (vg. both partners are still male). How did that happen? Don't confuse what the spec. recommends, versus what is actually possible.
-
Mahdi over 9 yearsA family tree software that won't allow second cousins to marry is useless. Nearly all families has atleast one case of this. Which is why I think the original example is made up for effect.
-
Bert Goethals almost 9 years@TylerH I did mean at least. Also any "relation" could be expressed as a multiple of pairs. Eg: a threesome can be expressed as 2 relations per individual.
-
Bert Goethals almost 9 years@Pierre My post is about the Spec; not about what other applications try to achieve by breaking the spec. A spec "specifies" not recommends. The fact that implementers are breaking the spec indicates that the spec is indeed flawed.