Splitting a list into two distinct sublists


Let's say you have the following list, or string:

ababcdcd

How would you find the point between abab and cdcd, i.e., the point where the two substrings have no characters in common?

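s = "ababcdcd"
# Try every split point that leaves both sides non-empty.
for i in range(1, len(s)):
    # An empty intersection means the two sides share no characters.
    if not set(s[:i]) & set(s[i:]):
        print(s[:i], s[i:])
        break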

This solution converts the two substrings into sets and intersects them with the & operator. When the intersection is empty, the substrings share no characters, so they are distinct.
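
To make the emptiness check concrete, here is a minimal sketch of what the intersection evaluates to for a good split and for an overlapping one. The variable names are only illustrative.

```python
# Each set holds the unique characters of one substring.
left, right = set("abab"), set("cdcd")
common = left & right
print(common)        # set()
print(not common)    # True, because an empty set is falsy

# When the two sides overlap, the intersection is non-empty (truthy).
overlap = set("abc") & set("bcd")
print(sorted(overlap))  # ['b', 'c']
```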

The solution can also be expressed as a generator expression:

s = "ababcdcd"
gen = ((s[:i], s[i:]) for i in range(1, len(s)) if not set(s[:i]) & set(s[i:]))
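
The generator is lazy, so nothing is computed until it is consumed. One possible way to use it, sketched here with a hypothetical helper named first_split, is to pull the first match with next() and supply a default for inputs that have no valid split point:

```python
def first_split(s):
    # Same generator expression as above, wrapped so it can be reused.
    gen = ((s[:i], s[i:]) for i in range(1, len(s)) if not set(s[:i]) & set(s[i:]))
    # next() evaluates only as far as the first match; the default
    # avoids a StopIteration when no split point exists.
    return next(gen, None)

print(first_split("ababcdcd"))       # ('abab', 'cdcd')
print(first_split([1, 2, 1, 3, 4]))  # ([1, 2, 1], [3, 4])
print(first_split("abab"))           # None
```

Because slicing and set() work on any sequence of hashable items, the same expression handles lists as well as strings.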

Getting all distinct substrings

Printing out all distinct substrings is also straightforward:

s = "ababcdcdefefef"
indices = [0] + [i for i in range(1, len(s)) if not set(s[:i]) & set(s[i:])] + [len(s)]
print([s[i:j] for i, j in zip(indices, indices[1:])])


['abab', 'cdcd', 'efefef']
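
The list comprehension above rebuilds two sets for every index, which can get slow on long inputs. As an alternative sketch (a different technique from the one shown above, with the hypothetical name distinct_chunks), a single pass that tracks the last position of each item should find the same split points for non-empty input:

```python
def distinct_chunks(seq):
    """Split seq into consecutive chunks that share no items with each other."""
    # Last index at which each item occurs.
    last = {item: i for i, item in enumerate(seq)}
    chunks, start, end = [], 0, 0
    for i, item in enumerate(seq):
        # The current chunk must extend at least to this item's last occurrence.
        end = max(end, last[item])
        if i == end:
            # Nothing seen so far appears later, so a chunk can close here.
            chunks.append(seq[start:i + 1])
            start = i + 1
    return chunks

print(distinct_chunks("ababcdcdefefef"))  # ['abab', 'cdcd', 'efefef']
```

Like the set-based version, this assumes the items are hashable.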

See also:

shifted zip