
27 March 2017 @ 04:06 pm
The regular icosahedron hides many regular pentagons, surrounding each of its vertices. Cut it through the planes of two parallel pentagons, and you decompose it into two pyramids and a pentagonal antiprism. But what if you add a second antiprism, and glue the result back together?

The result is a 17-vertex triangulated planar graph called the Errera graph. Its main claim to fame is as one of the counterexamples to Kempe's false proof of the four color theorem, but it also describes the shape of clusters of gold atoms. Its dual graph (a planar graph with 12 pentagonal faces and 6 hexagonal faces) is one of the fullerenes, although too small to be a stable one.

This construction from an icosahedron shows that the Errera graph can be realized as a non-convex deltahedron, a polyhedron in which all faces are equilateral triangles. It can also be made convex, but at the expense of making its faces non-regular. You can also think of the same construction as forming the Errera graph from two icosahedra by slicing a vertex off each one and gluing them together on the resulting pentagonal faces.
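If you want to check the arithmetic of this construction without building a model, it's easy to do combinatorially. Here's a sketch; the layer-and-apex vertex labels are my own encoding of the two stacked antiprisms, and the claim that this is the Errera graph is the claim of the construction above.

```python
def errera_graph():
    """Build the graph of the construction above: three stacked pentagons
    forming two pentagonal antiprisms, capped by a pyramid at each end."""
    edges = set()

    def add(u, v):
        edges.add(frozenset((u, v)))

    for layer in range(3):
        for i in range(5):
            add((layer, i), (layer, (i + 1) % 5))      # the three pentagons
    for layer in range(2):
        for i in range(5):
            add((layer, i), (layer + 1, i))            # antiprism edges: each vertex
            add((layer, i), (layer + 1, (i + 1) % 5))  # sees two in the next layer
    for i in range(5):
        add('T', (0, i))                               # top pyramid apex
        add('B', (2, i))                               # bottom pyramid apex
    return edges

E = errera_graph()
V = {v for e in E for v in e}
degree = {v: sum(v in e for e in E) for v in V}
print(len(V), len(E))  # 17 45
# A triangulated planar graph has exactly 3V - 6 edges.
assert len(E) == 3 * len(V) - 6
assert sorted(degree.values()) == [5] * 12 + [6] * 5
```

The degree count (twelve vertices of degree five, five of degree six) matches the Errera graph's published parameters.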

I had somehow acquired the impression that the Errera graph got its name as an unusual spelling of "error", from the fact that it was used to point out Kempe's error. But that impression itself turns out to be erroneous. It's actually someone's name: Belgian mathematician Alfred Errera, who published it in his 1921 doctoral dissertation.
Denis Kurz, a student of Petra Mutzel at Dortmund, visited Irvine last year. Since Kurz had already done some work on k shortest simple paths with Mutzel, that's what we worked on during his visit. Our paper is now online as a preprint: "K-Best Solutions of MSO Problems on Tree-Decomposable Graphs", arXiv:1703.02784.

If you just want the k shortest paths between two vertices in a graph, and you don't care whether a path might have repeated vertices in it, you can do it in constant time per path after a near-linear amount of preprocessing — this is the result from my FOCS'94 / SICOMP'98 paper "Finding the k shortest paths", still my most heavily cited paper. This works well, for instance, when the input is a DAG, because in that case repeated vertices are impossible. But it's less interesting in road networks: you wouldn't want to ask your GPS for an alternative route and be told to follow the same route as before except for turning into a short cul-de-sac and then coming back out of it, somewhere in the middle of the route.

What is often used instead is an algorithm for k shortest simple paths: each path has to avoid repeated vertices and edges. They can still be found, but not as efficiently. Although heuristic improvements have been developed, the best algorithm (in terms of its worst-case complexity) is still the one from a 1971 paper by Jin Y. Yen, whose time is O(mn) per path, much slower. And there's some evidence in a FOCS'10 paper of Williams and Williams that this time bound, which is cubic for dense graphs, is the best possible: it can't be improved to have a better exponent of n than three without making similar improvements to several other famous algorithmic problems. Their result doesn't rule out moving the cubic part of the cost from being per-path to being part of the preprocessing, but I don't know how to do that either.
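Yen's algorithm itself is simple to sketch, if not to make fast: for each prefix of the most recently accepted path, block the prefix's interior vertices and the continuing edges of already-accepted paths that share the prefix, and reroute from the prefix's endpoint. A rough Python version (my own encoding, with adjacency lists of (neighbor, weight) pairs; not the data structures you'd want for road networks):

```python
import heapq

def dijkstra(adj, source, target, banned_nodes, banned_edges):
    """Shortest s-t path avoiding banned vertices/edges; (cost, path) or None."""
    pq = [(0, source, (source,))]
    done = set()
    while pq:
        cost, u, path = heapq.heappop(pq)
        if u == target:
            return cost, list(path)
        if u in done:
            continue
        done.add(u)
        for v, w in adj.get(u, ()):
            if v not in done and v not in banned_nodes and (u, v) not in banned_edges:
                heapq.heappush(pq, (cost + w, v, path + (v,)))
    return None

def yen(adj, source, target, k):
    """Yen's 1971 algorithm for the k shortest simple paths."""
    best = dijkstra(adj, source, target, set(), set())
    if best is None:
        return []
    A, B = [best], []          # accepted paths, candidate heap
    while len(A) < k:
        _, prev = A[-1]
        for i in range(len(prev) - 1):
            root = prev[:i + 1]
            root_cost = sum(next(w for x, w in adj[u] if x == v)
                            for u, v in zip(root, root[1:]))
            # Block the next edge of every accepted path sharing this root,
            # and the root's interior vertices, then find a shortest detour.
            banned_e = {(p[i], p[i + 1]) for _, p in A
                        if len(p) > i + 1 and p[:i + 1] == root}
            spur = dijkstra(adj, root[-1], target, set(root[:-1]), banned_e)
            if spur is not None:
                cost, tail = spur
                cand = (root_cost + cost, root[:-1] + tail)
                if cand not in B and cand not in A:
                    heapq.heappush(B, cand)
        if not B:
            break
        A.append(heapq.heappop(B))
    return A

adj = {'a': [('b', 1), ('c', 2)], 'b': [('c', 1), ('d', 4)],
       'c': [('d', 1)], 'd': []}
print(yen(adj, 'a', 'd', 3))
# → [(3, ['a', 'b', 'c', 'd']), (3, ['a', 'c', 'd']), (5, ['a', 'b', 'd'])]
```

Each output path costs one shortest-path computation per vertex of the previous path, which is where the O(mn)-per-path bound comes from.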

So instead, Denis and I considered whether the time for finding simple paths could be improved if we assumed that the graphs had some additional structure. The answer is yes: for graphs of bounded treewidth we can find the k shortest simple paths in only logarithmic time per path. It's not quite constant, but it is exponentially faster per path than Yen.

It's not entirely surprising that an improvement is possible for bounded-treewidth graphs (a lot of things are easier for them) but proving it involved a novel combination of algorithmic components including shallow tree decompositions, logical formulations of graph properties, dynamic graph data structures, path-copying partial persistence, and selection in heap-ordered infinite trees.

In part because of the generality of these tools, our main result also ended up being very general: instead of just applying to simple paths, we can find the k best solutions to a broad family of combinatorial optimization problems on graphs, basically the problems that can be solved efficiently using Courcelle's theorem. For instance, we can find the k best traveling salesperson tours, no less efficiently than the k shortest simple paths.
19 February 2017 @ 07:28 pm
A penny graph (the topic of today's new Wikipedia article) is what you get from pennies by shoving them together on a tabletop, keeping them only one layer thick, and looking at which pairs of pennies are touching each other. In a 2009 paper, Konrad Swanepoel suggested that, when there are no three mutually-touching pennies, the number of adjacencies should be at most 2n − 2√n. I don't know how to prove that, and as far as I know the problem remains open. But here's an argument for why the number of edges should be strictly less than 2n − 4, the upper bound given by Swanepoel.

The basic idea is that, in any set of triangle-free pennies, some penny is touched at most twice. Or, in graph theoretic terms, a triangle-free penny graph must be 2-degenerate. For, suppose to the contrary that all pennies had at least three neighbors, and consider the outer face of the planar graph formed by the pennies (marked by the blue polyline below). For each penny on the outer face, draw a line through its center and the center of a neighboring penny that is not one of its two outer-face neighbors (the red lines in the figure).

As you walk around the outer face (clockwise, say) these red lines would always have to point consistently inwards, so they would have to turn with you, eventually making a complete circuit in the clockwise direction. But they can only turn counterclockwise! If you follow an outer face edge whose adjacent inner face is a quadrilateral, the red lines stay parallel (as they do in most of the picture) and otherwise they turn the wrong way. The impossibility of the red lines making a complete clockwise turn while only turning counterclockwise at each step shows that our assumption, that all pennies have three neighbors, cannot be correct. Therefore, there is a penny on the outer face with at most two neighbors.

Five-vertex triangle-free penny graphs have at most five edges, so by induction n-vertex triangle-free penny graphs have at most 2n − 5 edges, very slightly edging out an upper bound of 2n − 4 given by Swanepoel based on Euler's formula.
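Both claims are easy to check on an example. Here's a sketch using unit-diameter pennies centered on a square grid, which gives a triangle-free penny graph:

```python
from itertools import combinations

def penny_graph(centers, r=0.5, eps=1e-9):
    """Adjacency lists of the penny graph: an edge wherever two centers
    are exactly one diameter (2r) apart."""
    adj = {c: [] for c in centers}
    for p, q in combinations(centers, 2):
        if abs((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 - (2 * r) ** 2) < eps:
            adj[p].append(q)
            adj[q].append(p)
    return adj

def degeneracy(adj):
    """Greedy elimination: repeatedly delete a minimum-degree vertex;
    the largest degree seen at deletion time is the degeneracy."""
    adj = {u: set(vs) for u, vs in adj.items()}
    d = 0
    while adj:
        u = min(adj, key=lambda v: len(adj[v]))
        d = max(d, len(adj[u]))
        for v in adj.pop(u):
            adj[v].discard(u)
    return d

# A 6 x 6 square grid of pennies: triangle-free, since the grid graph is bipartite.
centers = [(x, y) for x in range(6) for y in range(6)]
adj = penny_graph(centers)
n = len(centers)
m = sum(len(vs) for vs in adj.values()) // 2
assert m <= 2 * n - 5     # the bound argued above: 60 <= 67
assert degeneracy(adj) == 2
```

Incidentally, this grid has exactly 60 = 2·36 − 2·√36 adjacencies, matching the conjectured 2n − 2√n bound with equality.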

The fact that penny graphs are 3-degenerate is a standard exercise, part of an easy proof of the four-color theorem for these graphs. Similarly, the 2-degeneracy of triangle-free penny graphs leads to an easy proof of Grötzsch's theorem for them. So I would be surprised if the 2-degeneracy of the triangle-free penny graphs is new, but I didn't see it when I was researching the Wikipedia article. Does anyone know of a reference?
The following image shows the onion layers of a 6 × 6 grid (see Sariel's post for context). It consists of 42 circles, 36 with a fill and a stroke and 6 with only a stroke.

My usual workflow is to draw this sort of thing in Adobe Illustrator, but one drawback of doing things this way is huge files: as a PDF file suitable for inclusion in LaTeX, these 42 circles take up 155k bytes. I've recently started cutting my file size by opening the Illustrator PDF in the Apple Preview application, and then using Preview's "export" command with the "reduce file size" filter. This cuts it down to 41k, still nothing impressive but a lot better. It does have the drawback that going between Illustrator and Preview multiple times has sometimes messed up my fonts, so I think it's best viewed as a one-way filter: if I want to re-edit the same file, I need to keep the original.

But if I save as an SVG file from Illustrator (like the one above) it's only 3.3K, a reasonable size for such a simple image. Reading the SVG back into Illustrator and saving as PDF blows it back up to 88k, which is still way too big. So I thought: maybe there's a good command line tool for converting SVG to PDF? I could have a workflow where I use Illustrator only to read and edit SVG files (keeping them around so that I can re-edit them if I want, without as much confusion as keeping two different PDF versions of the same file) and use a different program to convert them to small PDFs.

After some searching, I found CairoSVG, which (after a lot of hassle installing, see below) seemed to work well: it produces a 3.5k byte PDF file from the same image.

The installation process was a bit of a mess. Very different from typical OS X download-and-run installation processes:
  1. Install pip from https://pip.pypa.io/en/stable/installing/
  2. Use pip to install CairoSvg per http://cairosvg.org/
  3. Try running cairosvg from the command line, but it doesn't work because it is unable to load the cairo library. After much searching, discover what the CairoSVG web site never tells you: that cairo is a separate piece of software that you also need to install.
  4. Install MacPorts from https://www.macports.org/install.php
  5. Try using MacPorts to install cairo, but discover that it requires the App version of Xcode and not just the command-line developer tools I already had installed.
  6. Install Xcode from the Apple App store
  7. Try MacPorts again, but discover that the App store install is not the real install.
  8. Run Xcode to install it for real.
  9. Use MacPorts to install cairo per https://www.cairographics.org/
  10. Try running cairosvg from the command line, but it still can't find the library.
  11. Searching the net for the error message eventually leads to https://github.com/Kozea/WeasyPrint/issues/79 which advises setting the environment variable DYLD_FALLBACK_LIBRARY_PATH to /opt/local/lib
  12. It used to be the case that setting environment variables globally was done by editing the file ~/.MacOSX/environment.plist but that no longer works — see http://stackoverflow.com/questions/603785/environment-variables-in-mac-os-x/ — so instead I've been putting them in my .login file.
  13. After editing .login to set the environment variable I need to make a new terminal window because the old ones won't have it set yet.
After all this, the command-line cairosvg runs, and the command line "cairosvg grid-onion.svg -o grid-onion.pdf" produces the small PDF file reported above.

But we're not done...

It turns out that the resulting PDF has a different image size than the original file. After some more groveling on the net I discovered that it's using a default value of 96 dots per inch rather than respecting the 72dpi value used by Illustrator. So when I view the resulting pdf files in Illustrator, Preview, or TeX, they don't have the same scale as when I drew them. This is a problem, because I've been using relative image sizes (the scale= parameter of graphicx) when including images in some of my LaTeX documents.

The cairosvg command line program has two options for changing the scale. We can either set the program to use 72 dpi, or we can tell it to scale the image by a factor of 1.33. Both generate images that have the correct size. But now the bounding box of the image has been destroyed, and it's instead using a bounding box that's way too big. This turns out to be true for svg to png conversion as well: if I try to use cairosvg to make a png from my svg file, it also loses the bounding box. There are no options for telling cairosvg to avoid destroying the bounding box, and no suggestion on the cairosvg issue tracking site that anyone knows or cares about this problem.

After all this, I'm ready to give up on cairosvg as being worth what I've paid for it (nothing). Does anyone know of an svg-to-pdf conversion pipeline that produces small pdf files but actually works?
14 February 2017 @ 06:27 pm
The tetrahedron and the Császár polyhedron are polyhedra whose edges and vertices form complete graphs; it is unknown whether any others exist. At the tutorial session for Graph Drawing 2015, I asked whether there are any complete bipartite polyhedra.

As the definition of non-convex polyhedra can be the subject of heated debate (check out Wikipedia:Talk:Polyhedron if you dare), I should clarify that the kind of polyhedron I am looking for is a piecewise-linearly-embedded manifold whose maximal linear pieces (the faces) are simple polygons. This is all true of the Császár polyhedron, for instance. For non-geometric definitions of polyhedra, the existence of complete bipartite polyhedra is already known; for instance Archdeacon ("A survey of self-dual polyhedra", 1992) describes a polyhedral embedding of K6,6 on a topological surface with 17 crosscaps, dual to K9 on that surface.

A near-miss geometric complete bipartite polyhedron is the doubly covered square, whose graph is the complete bipartite graph K2,2. This is in some sense a Euclidean convex polyhedron; at least, it satisfies the criteria of Alexandrov's characterization of convex polyhedra by their surface metrics. But its two faces coincide, rather than intersecting only at their shared vertices and edges. The graph of the regular octahedron is the complete tripartite graph K2,2,2, and the graph of the cube is a subgraph of K4,4, but it is missing some of the edges (the long diagonals) needed to form the whole complete bipartite graph.

In a complete bipartite graph, represented in this way, each face must be a quadrilateral, because higher-order faces would have diagonals that form edges of other faces, not possible in a manifold. This puts some constraints on which bipartite graphs we can have: the number of edges must be divisible by two (so that there is an integer number of quadrilaterals) and in addition the number of edges must be 2 (mod 4) if the number of vertices is odd, or 0 (mod 4) if the number of vertices is even, so that the Euler characteristic will be even. (Odd Euler characteristics only happen for non-orientable manifolds, which cannot be embedded without crossings in Euclidean space.) And K2,n is never possible, because it has vertices of degree two which must come from coplanar faces or collinear edges, neither of which is allowed by the specific definition of polyhedra chosen above.
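These counting conditions are easy to check mechanically. A sketch, encoding exactly the constraints above (minimum degree three, an even number of edges, and even Euler characteristic χ = V − E + E/2):

```python
def candidate(a, b):
    """Can K_{a,b} possibly be a quadrilateral-faced manifold polyhedron,
    by the counting constraints above?"""
    V, E = a + b, a * b
    return (min(a, b) >= 3          # degree-2 vertices are ruled out
            and E % 2 == 0          # E/2 quadrilateral faces must be an integer
            and (V - E // 2) % 2 == 0)  # even Euler characteristic

pairs = [(a, b) for a in range(1, 8) for b in range(a, 8) if candidate(a, b)]
print(pairs)  # [(3, 6), (4, 4), (4, 6), (5, 6), (6, 6), (6, 7)]
```

The two survivors with the fewest edges are K4,4 (16 edges) and K3,6 (18 edges).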

Based on this calculation, the simplest complete bipartite graphs that could work are K4,4 and K3,6. Both of these form nice polyhedral subdivisions of a torus:

I have been unable to find manifold embeddings of these tori, but I also don't have a proof that they're impossible. I did, however, find a realization of K4,4 as a crossed polyhedron (meaning: a subdivision of a topological manifold, with its vertices mapped to distinct points of Euclidean space in such a way that the faces are mapped to distinct planes, but allowing edge and face crossings):

It's based on a complete quadrilateral in the z = 0 plane (the black lines) with two of its six points lifted and lowered in z to form pairs of points. In this arrangement of points, A and C form dart-shaped quadrilaterals with the blue points on the line x = 0 and with the other two blue points. Similarly, B and D form two darts with the same pairs of blue points. A and D form a tilted dart with the two blue points on the diagonal x = y, and B and C also form a tilted dart with the same two diagonal points. Finally, A and B form a crossed quadrilateral (an antiparallelogram) with the two blue points on the line x + y = 3, and C and D form another antiparallelogram with the same two blue points. So each edge of the complete bipartite graph is used twice, giving us a topological manifold.

This is not the torus above, because the face pairings are different. It turns out to be a different torus:

So, this does show that K4,4 can be mapped to space so that the points form eight planes of four points, intersecting in the desired edges, and forming a configuration of eight points and eight planes with four points per plane and four planes per point. But it doesn't really solve the original problem. Can it be untangled to form an embedded torus with the same connectivity?

ETA 2/16: K3,6 is also easy to realize as a crossed polyhedron. Place A, B, and C on a triangle, choose three planes through each pair of these points, and place the other six points where triples of these planes meet. But be careful to choose the correct triples of planes: if you place A, B, and C on an equilateral triangle, and the other six points at equal distances above and below the midpoints, you can realize K3,6 as a crossed polyhedron with three square sides and six antiparallelogram sides, but it's a Klein bottle, not a torus.
31 January 2017 @ 04:35 pm
I'm in Barbados for a week, but that isn't helping me avoid seeing the circus our head bozo has made.
22 January 2017 @ 07:31 pm
The view out my back door this afternoon:

I'm pretty sure the white hair-like things in the background are raindrops, motion-blurred into streaks.

I just replaced my laptop (the old one's keyboard and trackpad were dying), so if the colors are off or different from before, that's why. Also much fussing with Python ensued to get my old gallery-creation script running again, since the brain transplant didn't include any libraries I might have been depending on, and the versions I could figure out how to install in their place had a slightly different API. I'm just glad my old copy of Photoshop still runs and I don't have to switch to the subscription model or find an alternative.
18 January 2017 @ 05:24 pm
Everyone who plays tic-tac-toe quickly learns that the nine squares have different strengths. The center square is best, because it belongs to four lines. The corners, on three lines each, are next. And weakest are the squares in the middle of a side of the tic-tac-toe grid, which are only on two lines each. What about higher dimensions? (Let's ignore the fact that tic-tac-toe on a 3×3×... grid is an easy win in more than two dimensions.) Which are the weak cells, and which are the strong ones?

For instance, the image below shows a four-dimensional tic-tac-toe grid, projected to the plane. Each cell of the grid becomes a colored dot; the five colors distinguish the cells by the dimension of the hypercube face that they're the midpoint of. We have vertices (blue), edges (yellow), squares (red), cubes (white), and the center of the hypercube itself (black). But I've only shown a subset of the lines, four through each point. If I included all the lines, which colors would be more powerful (on more lines), and which less?
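For small dimensions, brute force can settle the question. A sketch: enumerate every line as an endpoint plus a direction vector in {−1, 0, 1}^d (so each line is generated twice, once from either end), and group the cells by color, i.e., by how many of their coordinates equal 1:

```python
from itertools import product
from collections import Counter

def line_counts(d):
    """Winning-line counts per cell of the 3**d grid, keyed by the number
    of middle (=1) coordinates: 0 = hypercube vertex, d = central cell."""
    cells = set(product(range(3), repeat=d))
    per_cell = Counter()
    for start in cells:
        for v in product((-1, 0, 1), repeat=d):
            if not any(v):
                continue
            line = [tuple(s + t * w for s, w in zip(start, v)) for t in (0, 1, 2)]
            if all(c in cells for c in line):
                for c in line:
                    per_cell[c] += 1   # each line counted twice, once per endpoint
    by_color = {}
    for c in cells:
        by_color.setdefault(c.count(1), set()).add(per_cell[c] // 2)
    assert all(len(s) == 1 for s in by_color.values())  # symmetry sanity check
    return {j: s.pop() for j, s in sorted(by_color.items())}

print(line_counts(2))  # {0: 3, 1: 2, 2: 4}: corners, side midpoints, center
print(line_counts(4))  # the four-dimensional answer to the question above
```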

17 January 2017 @ 07:20 pm
When I cover directed acyclic graphs for my algorithms classes, I usually mention as an example (familiar to the students) a DAG with university courses as its vertices and with prerequisites as its edges.

Today I learned that this is a lie, or maybe I should say more of a lie than I already thought it was. It turns out that, in our course catalog, prerequisites are not just lists of courses that all must be taken first. Instead, even if we ignore the grade thresholds in some of them, they are complicated and-or formulas describing the combinations of courses that are allowed to be used as preparation. Some of them are written in conjunctive normal form (ors of ands), some are in disjunctive normal form (ands of ors), and at least one (our information retrieval course) requires multiple levels of parentheses in its formula.

I had thought that this was mostly because some courses that have different numbers really have the same content, and should be treated as equivalent for the purposes of the prerequisite ordering. In order-theoretic terms, that would mean that we have a quasi-order rather than a partial order. For instance, we went through a refactoring of our curriculum a few years back and our prerequisite listings still include the numbers for both our old sophomore data structures course and our new one. But that's just to avoid blocking the leftover students who took it before we changed; you wouldn't want to take both classes. Or, we have three different ways you can learn linear algebra: by doing lots of hand calculations (the way the engineers want), by learning about vector spaces as abstract coordinate-free things (the way the mathematicians want), or by interacting with MATLAB or the equivalent (the way the computer scientists, and especially the machine learning people want). But you would only take one of these, not all three. So my feeling was that, at least when you restricted the set of classes to the ones taken by any individual student, you would get a partial order.

But even that is not true. Take a look early in the linked catalog page, at our course CS 113, computer game development. Its prerequisites are one of computer graphics, AI, software design, critical analysis of games, graph algorithms, or game design. Why that list? I don't know, it's not my course. But it's completely reasonable to take more than one of these courses, in various different orderings, and the fact that it's an "or" of these courses rather than an "and" means that we're not treating this list the way we would for topological ordering of DAGs.

So, if not a DAG, what is the prerequisite structure? An antimatroid! Or almost. Once you've fulfilled the prerequisites for a course, you can take the course in any later term that it's offered — they don't go stale. More abstractly, an element, once available to be included in an ordering, remains available until it is used. And this is the defining property of an antimatroid. The "almost" is because there are also some constraints that you can't take both of some pairs of classes for credit, but this only applies if you look at the whole course catalog at once. If you look at what any individual student takes, I think it really is an antimatroid. Of course, I may not have examined the catalog closely enough to find the exceptions...
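This availability property is easy to test on a toy catalog. A sketch, where each prerequisite is a monotone boolean formula over the set of completed courses (the course names and formulas below are invented for illustration, not taken from our actual catalog):

```python
# Each prerequisite is a monotone function of the set of completed courses:
# once satisfied, it stays satisfied, which is what makes this an antimatroid.
prereq = {
    'intro':    lambda done: True,
    'ds':       lambda done: 'intro' in done,
    'graphics': lambda done: 'ds' in done,
    'ai':       lambda done: 'ds' in done,
    'games':    lambda done: 'graphics' in done or 'ai' in done,  # an "or" prereq
}

def feasible_sets(prereq):
    """All course sets reachable by taking one available course at a time."""
    feasible = {frozenset()}
    stack = [frozenset()]
    while stack:
        s = stack.pop()
        for c, ok in prereq.items():
            if c not in s and ok(s):
                t = s | {c}
                if t not in feasible:
                    feasible.add(t)
                    stack.append(t)
    return feasible

F = feasible_sets(prereq)
# The two antimatroid axioms: accessibility, and closure under union.
assert all(any(s - {c} in F for c in s) for s in F if s)
assert all(s | t in F for s in F for t in F)
print(len(F))  # 9 feasible sets
```

The assertions pass for any monotone prerequisite formulas, which is the abstract version of "courses don't go stale".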
07 December 2016 @ 09:11 pm
My latest arXiv preprint is "Covering many points with a small-area box", arXiv:1612.02149, with de Berg, Cabello, Cheong, and Knauer. It is about finding an axis-parallel rectangle that covers k out of n points in the plane and has as small area as possible. Here, both k and n are variables, but k is expected to be significantly smaller than n. So, in finding a fast algorithm, the goal is first to minimize the exponent of n in the time bound, and only after that to minimize the dependence on k. We achieve a bound of O(nk² log n). There are also some related algorithms for approximating the dual problem where you are given a fixed area and want to maximize the number of points that are covered.
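For anyone who wants to experiment, here's a brute-force baseline, much slower than the algorithm in the paper but straightforward: an optimal rectangle can be shrunk until its left and right sides pass through input points, and once the x-range is fixed, the best top and bottom come from k consecutive points in sorted y order.

```python
def min_area_k_rect(points, k):
    """Smallest area of an axis-parallel rectangle covering at least k
    of the points.  Exhaustive over x-ranges, roughly O(n^3 log n):
    a checker for faster implementations, not a competitor."""
    xs = sorted(set(x for x, _ in points))
    best = None
    for i in range(len(xs)):
        for j in range(i, len(xs)):
            width = xs[j] - xs[i]
            ys = sorted(y for x, y in points if xs[i] <= x <= xs[j])
            # slide a window of k consecutive y-values
            for t in range(len(ys) - k + 1):
                area = width * (ys[t + k - 1] - ys[t])
                if best is None or area < best:
                    best = area
    return best

pts = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6), (3, 9)]
print(min_area_k_rect(pts, 3))  # → 1
```

Degenerate (zero-area) rectangles are allowed here; whether they should be depends on your definition of "covering".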

The result has a bit of a curious history. A group of us worked on it, and came up with the best algorithms we could, thinking it was a new and unsolved problem. Then, we did a more careful literature search and found that it was actually a known problem, and worse than that, that the time bounds we had achieved were all known or improved, so that we had no result. But then we looked again and discovered that the supposedly known bounds aren't actually true.

What happened was that there was a line of research (that I was part of) in the early 1990s on finding "small" k-point subsets of a larger set of points. A sequence of papers on this subtopic identified many different definitions of what it meant to be small (such as having low circumradius or diameter) that could all be solved by similar algorithms. These algorithms worked by covering the input by clusters of O(k) points, one of which could be guaranteed to include the optimal solution, and then worked separately within each cluster. A simple version of this is to choose O(k) nearest neighbors to each point and to use these n sets of near neighbors as the clusters; later refinements used fewer and easier-to-find clusters. The algorithms of this type varied by what they did inside the clusters, but this general method ensured that the dependence on n would be linear. And one of the problems solved using this approach was almost the same as the one in our new paper, but with axis-parallel rectangle perimeter in place of rectangle area.

Later, another paper studied the problem of finding smallest rectangles covering k points, but with k assumed to be very close to n rather than much smaller than n, so that factors of k in the runtime are bad and factors of (n − k) are good. It solved both the minimum-perimeter and minimum-area problems for the large-k case. Of course, it quoted the past work on the minimum perimeter problem for small k, but it didn't make clear that this past work solved only the case of perimeter minimization and not area minimization. And so, other papers, quoting that one, started also saying that the minimum-area k-point rectangle problem (for the small-k case) had also already been solved.

It hadn't, but now it has.
24 November 2016 @ 01:59 pm
Happy Thanksgiving to all. Here's a photo from a walk I took this morning at the San Joaquin Wildlife Sanctuary.

The gallery has a few more.