Automatic Connector Cutout Recognition (Part 1)

December 16, 2018


NOTE: If you are coming to this page and have no idea what this site is about, check out the home page or FAQ!

It has been some time since I last updated the EnclosureGenerator application, largely due to my own time constraints. However, I have not stopped working on improvements, albeit at a slower pace than I would like. This blog post is about an interesting research project I have been working on: automatic connector cutout recognition for PCBA enclosures. A smarter cutout recognition algorithm is a feature I have wanted to develop for some time, and over the last few weeks I was finally able to get around to it. It is not yet ready for prime time, but I believe it is a step in the direction of fully automated generative enclosure design.

Problem Description

What components need cutouts? Of those components, which component side needs a cutout? Once you know which side of a given component needs a cutout, what should the cutout look like?

An IDF file contains all of the information used by the EnclosureGenerator in its current state, but an IDF file does not contain all important information about a printed circuit board assembly (PCBA). One of the most critical pieces of information pertaining to enclosures that an IDF lacks is the geometry of components outside of the basic component footprint. This becomes a problem when a component is one that needs a cutout (such as a connector).

Currently the EnclosureGenerator tool allows a user to specify the direction of cutout required for a given component, but this is frustrating to use and runs counter to the spirit of generative design with minimal user input. It would be great if a user could simply upload a 3D model of the circuit board and have the connectors identified, the cutout direction for each connector set automatically, and each cutout sized appropriately to its connector. Most people familiar with circuit boards would quickly identify the USB, HDMI, audio jack, Ethernet, header pins, and ZIF connectors of the Raspberry Pi shown, and would know on which side to make a cutout. This is easy for a person to figure out, less so for a computer.


Creating a general purpose component cutout recognition/creation capability using only the information in a 3D PCBA file is quite a challenging task, so I simplified the problem to only identifying what part of a connector needs a cutout. In other words, I will assume that the 3D model of the connector has been extracted from the assembly model containing the PCB and other components, and I will ignore the need to create a cutout sized for the specific component. So now the problem looks more like this:

This is an image of a 3D model of an Ethernet connector. Given such a model of an Ethernet connector (or any other PCBA component), which part of the connector needs a cutout?

Given this simplification, my initial inclination was to create a library and lookup table of components and cutouts, where a component name would be tied to a specific cutout. This would require a large amount of manual work up-front (manually identifying component sides for a large number of components), but would be fast and reliable once the table had been built. The problem with this approach is that it likely wouldn't work "in the wild": I cannot control what name a user might have given to a component, and it won't work at all for PCBA models exported as a single monolithic 3D file with no references to individual component names (the Raspberry Pi image above was extracted from such a file). The alternative is to use the 3D geometry itself as the "key" to the lookup table, but I wasn't sure how to implement that, and I believe it is still an area of active research.
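To make the limitation concrete, here is a minimal sketch of the lookup-table idea. All component names and cutout sides below are illustrative, not from an actual library:

```python
from typing import Optional

# Minimal sketch of the lookup-table approach: tie a component name to
# a cutout side. All names and sides here are made up for illustration.
CUTOUT_TABLE = {
    "RJ45_8P8C": "front",
    "USB_A_RECEPTACLE": "front",
    "HEADER_2X20": "top",
}

def lookup_cutout_side(component_name: str) -> Optional[str]:
    """Return the cutout side for a known component, else None."""
    # Normalize so minor naming differences still match.
    key = component_name.strip().upper().replace("-", "_")
    return CUTOUT_TABLE.get(key)

print(lookup_cutout_side("rj45-8p8c"))        # front
print(lookup_cutout_side("mystery_part_01"))  # None
```

The `None` case is exactly the weakness described above: any user-assigned name outside the table, or a monolithic 3D file with no names at all, defeats the lookup entirely.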

Before going down that rabbit hole, I thought I would try a simpler approach, which is to extract a set of images from different views of the 3D model, and then classify the views as needing a cutout or not needing a cutout. The image that is classified as needing a cutout will be tied to a specific side of the connector, and that side can then be given to the EnclosureGenerator tool as the desired cutout side. Image classification has become something that is quite easy to do using any of a large number of machine learning libraries, and this approach seemed to be the quickest way to yield any interesting results. It would also allow me to play with the FastAI library, which I had been meaning to try out.
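The decision logic this approach leads to is simple: classify each of the six views and hand the winning side(s) to the EnclosureGenerator. A sketch, with the classifier stubbed out and the side names and threshold chosen for illustration:

```python
# Sketch of the view-classification pipeline: run the image classifier
# on each of the six orthographic views and keep the sides whose view
# looks like a connector opening. The classifier itself is stubbed out
# here; in practice it would be the trained image model.
SIDES = ["top", "bottom", "front", "back", "left", "right"]

def pick_cutout_sides(opening_probs, threshold=0.5):
    """opening_probs: P(view shows a connector opening), one per side.
    Returns the sides that should receive a cutout."""
    return [side for side, p in zip(SIDES, opening_probs) if p >= threshold]

# Hypothetical classifier output for an Ethernet connector: only the
# front view resembles an opening.
probs = [0.03, 0.01, 0.94, 0.08, 0.05, 0.04]
print(pick_cutout_sides(probs))  # ['front']
```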


My first attempt at connector classification used a dataset of 1007 images extracted from several hundred different components sourced from Octopart and the KiCad open source repository of 3D components. I extracted images based on an orthographic projection, grabbing an image of each of the 6 sides of a given component. After some filtering and transformation, the images typically looked like the ones seen below:

Example connector and component images used for neural net training (from left to right): top view of an IDC header connector, front view of an HDMI connector, top view of the same HDMI connector, front view of a D-SUB connector, top view of a microSD slot
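The view-extraction step can be sketched roughly as follows: project the model's vertices onto each axis-aligned plane and rasterize the result into a binary silhouette. This is only meant to show the idea, and assumes the model is available as a vertex array; a real renderer would rasterize faces rather than isolated vertices:

```python
import numpy as np

def silhouettes(vertices, resolution=64):
    """Rasterize crude axis-aligned silhouettes of a vertex cloud."""
    views = {}
    for axis, name in [(0, "x"), (1, "y"), (2, "z")]:
        keep = [a for a in range(3) if a != axis]
        pts = vertices[:, keep]
        # Normalize the projected points into [0, resolution) pixels.
        mins, maxs = pts.min(axis=0), pts.max(axis=0)
        scaled = (pts - mins) / np.maximum(maxs - mins, 1e-9) * (resolution - 1)
        img = np.zeros((resolution, resolution), dtype=bool)
        ij = scaled.astype(int)
        img[ij[:, 1], ij[:, 0]] = True
        # In this simplified sketch, opposite sides share a mirrored
        # silhouette; a real renderer would hide occluded geometry.
        views["+" + name] = img
        views["-" + name] = np.fliplr(img)
    return views

# A unit cube's vertices mark the 4 corner pixels of each view.
cube = np.array([[x, y, z] for x in (0, 1) for y in (0, 1) for z in (0, 1)], float)
views = silhouettes(cube, resolution=8)
print(views["+z"].sum())  # 4
```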

Some component CAD models are quite simple, not much more than a rectangle, whereas others (such as the HDMI connector shown) are relatively full-featured. Although there wasn't a great deal of data at this point, I wanted to see what kind of performance I could get (if any). Using the FastAI library, which is built on top of PyTorch, I was able to quickly test a ResNet34 architecture neural network for classifying images as either a "connector opening" or "not a connector opening". Largely following the first lesson in the excellent FastAI course, I was able to achieve a ~90% accuracy rate on the validation dataset. This was actually better than I expected, but still not much better than simply assigning the "not a connector opening" label to every image (88%). Moreover, the confusion matrix makes it fairly obvious that there are a large number of false negatives. This could be due to any number of reasons, but I believe the primary culprits were the inconsistent image sizes and the small, skewed dataset.

Confusion matrix generated from initial testing of ResNet34 architecture using FastAI.
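To illustrate how the comparison above is read (using made-up counts that are merely consistent with the percentages quoted, not the actual figures), accuracy and the all-negative baseline both fall out of the confusion matrix directly:

```python
# Illustrative 2x2 confusion matrix counts (hypothetical, chosen only
# to match the ~90% / 88% figures quoted above).
tn, fp = 870, 16   # true "not an opening" / wrongly flagged as opening
fn, tp = 85, 36    # missed openings / correctly found openings

total = tn + fp + fn + tp
accuracy = (tn + tp) / total
# The baseline classifier always predicts "not a connector opening".
baseline = (tn + fp) / total
print(f"accuracy={accuracy:.2f}, baseline={baseline:.2f}, false negatives={fn}")
```

With numbers like these, a 90% accuracy only beats the do-nothing baseline by about two points, and the false negatives (missed connector openings) dominate the errors, which is exactly the failure mode that matters for cutout generation.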

After fixing the image sizes to all be 200x200 by adding buffering whitespace, the performance improvement was pretty minimal: 92%. Looking at what the neural network was most confident about, it was clear that my connector dataset was heavily skewed towards header pin-style connectors. The figures below show some of the connector images for which the learner was most correct and most incorrect. The number below each image indicates the probability the learner assigned to the image being "not a connector". In other words, a very small number means the learner is very confident the image is a connector, whereas a large number means the learner thought the image was not a connector (even though it was).

Connector images where the learner was correct and very confident in a "connector" classification.

Connectors the learner was very incorrect in classifying. The sharp-eyed among you will notice some DIP switches in there. DIP switches are of course not connectors, but since it is often useful to have a cutout for access to a DIP switch, I considered them worthy of one. Clearly the learner does not agree.
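The size normalization mentioned above could look something like the sketch below (assuming grayscale images as numpy arrays; the actual preprocessing details are not shown in this post):

```python
import numpy as np

def pad_to_canvas(img, size=200, background=255):
    """Center a grayscale image on a size x size white canvas instead
    of stretching it, so aspect ratios are preserved."""
    h, w = img.shape
    if h > size or w > size:
        raise ValueError("image larger than canvas; downscale first")
    canvas = np.full((size, size), background, dtype=img.dtype)
    top = (size - h) // 2
    left = (size - w) // 2
    canvas[top:top + h, left:left + w] = img
    return canvas

small = np.zeros((120, 80), dtype=np.uint8)  # a dark 120x80 component image
padded = pad_to_canvas(small)
print(padded.shape, int((padded == 0).sum()))  # (200, 200) 9600
```

Padding with whitespace rather than resizing means the network sees components at consistent relative scale, at the cost of wasting some of each input image on blank border.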

The only fix for this was to grow the number of component images in order to create a more robust classifier. After scouring the web for more component files, extracting images from them, and manually classifying those images, I had a dataset of ~7000 classified component images ready to go. These components included pin headers, USB connectors, resistors, ICs of various types and packages, and much more. After training a ResNet34 architecture neural net on the new data, the model's accuracy improved to 95.5%! The confusion matrix of the predicted labels for the validation dataset can be seen below:

Confusion matrix after adding a lot of new learning data.

The learner is still best at identifying pin header-style connectors, as well as certain kinds of Molex connectors that were heavily represented in the dataset. However, some kinds of connectors, such as RJ45 Ethernet jacks or microSD card connectors, were very difficult for the classifier and were typically misclassified or given a very low confidence score.

These images are quite confusing to the model as it stands, likely due to insufficient training data, although I must admit that microSD connectors were often difficult for me to classify correctly, especially given the variable amount of model detail.
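One way to surface weak component types like these is to break validation results down per type rather than looking only at overall accuracy. A small sketch, with entirely made-up component types and predictions:

```python
from collections import defaultdict

def per_type_recall(records):
    """records: list of (component_type, true_label, predicted_label).
    Returns the fraction of correct predictions for each type."""
    hits, totals = defaultdict(int), defaultdict(int)
    for ctype, truth, pred in records:
        totals[ctype] += 1
        hits[ctype] += int(truth == pred)
    return {t: hits[t] / totals[t] for t in totals}

# Toy validation records: pin headers classify well, RJ45 and microSD
# do not (mirroring the pattern described above).
records = [
    ("pin_header", "connector", "connector"),
    ("pin_header", "connector", "connector"),
    ("rj45", "connector", "not_connector"),
    ("rj45", "connector", "connector"),
    ("microsd", "connector", "not_connector"),
]
print(per_type_recall(records))
```

A breakdown like this makes it obvious which component types need more training examples, rather than letting abundant pin headers mask the failures.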

Possible Next Steps


That is all for now, stay tuned for future updates, and happy holidays!

If you have any recommendations on new features, please take this survey and let me know what you would like in a future release!