DISCLOSURE: I am the developer of this library, which is based on my research, and this post includes self-promotion.
Hello everyone,
I have developed a custom ops library for PyTorch that implements paraboloid neurons in a mathematically optimal way. Under normal circumstances, a fully rotatable second-degree surface requires O(d^2) parameters, where d is the dimensionality of the input space. Paraboloid neurons can be implemented with only 2d = O(d) parameters, by using the definition of a paraboloid as the locus of points that are equidistant from a hyperplane and a point.
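To make the parameter count concrete, here is a toy sketch (my own illustration, not the library's actual implementation) of a neuron built from that definition: a focus point (d parameters) and a hyperplane directrix (d parameters for the normal plus a bias), with the response being the difference of the two distances. The function name and formulation are assumptions for illustration only.

```python
import torch

def paraboloid_response(x, focus, w, b):
    """Toy paraboloid neuron with ~2d parameters: a focus point `focus`
    (d params) and a directrix hyperplane w.x + b = 0 (d + 1 params).

    The zero set of the response is exactly the paraboloid: the locus of
    points equidistant from the focus and the hyperplane.
    """
    dist_to_focus = torch.linalg.norm(x - focus, dim=-1)   # ||x - focus||
    dist_to_plane = (x @ w + b) / torch.linalg.norm(w)     # signed distance to directrix
    return dist_to_focus - dist_to_plane

# Sanity check in 2D: the parabola y = x^2 / 4 has focus (0, 1) and
# directrix y = -1; the point (2, 1) lies on it, so the response is 0.
x = torch.tensor([[2.0, 1.0]])
focus = torch.tensor([0.0, 1.0])
w = torch.tensor([0.0, 1.0])   # directrix normal
b = torch.tensor(1.0)          # directrix: y + 1 = 0
print(paraboloid_response(x, focus, w, b))  # → tensor([0.])
```

This uses 2d + 1 scalars in total, versus the O(d^2) needed for a general quadratic form in d dimensions.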
There is a Free version of the library available through pip. The Free version restricts commercial use (use in AI competitions is allowed, even when there is a monetary reward) and includes the unoptimized first implementation that worked, but it is otherwise fully featured. Be mindful to install the version that matches your PyTorch version:
v2.9.0 - v2.10.0: pip3 install geondptfree==2.9.0.1.0.post1
v2.7.0 - v2.8.0: pip3 install geondptfree==2.7.0.1.0.post1
v2.5.1 - v2.6.0: pip3 install geondptfree==2.5.1.1.0.post1
At the moment, the custom ops on offer are Paraboloid, an equivalent of PyTorch's Linear op, and ParaConv2d, an equivalent of PyTorch's Conv2d. There is also a specialized SGD optimizer, named GeoNDSGD, that is required to properly handle paraboloid layers (Adam and AdamW are under development). For more details, please refer to https://geond.tech.
I invite you to try them out in your projects and would be very interested in hearing the results. Any feedback is welcome and I will be available to answer any questions and advise on how to use the library.
The ops are designed to easily replace their PyTorch equivalents: simply change the name of the op and leave all other arguments the same. They do have additional arguments that can be fine-tuned, but I believe their default values are pretty good. There are some small examples on CIFAR10 available on GitHub (GeoND-tech/pytorch-cifar-paraboloid) that showcase the recommended usage of paraboloid neurons. More specifically:
- Insert a Paraboloid layer right before the Linear output layer. This should improve the error (loss) function value. Whether that translates into improved accuracy can come down to luck on easier datasets (such as CIFAR10), but I believe that on harder datasets it is more likely to consistently improve accuracy as well.
- Replace the base convolutional layer with ParaConv2d. You can also use fewer units, as paraboloid neurons are more powerful.
- Avoid directly replacing the output layer, as I have been unable to get better results that way in either loss or accuracy. However, if you want to try it anyway, you should probably use init='spotlight' in the declaration.
- Avoid introducing more than one paraboloid layer at a time; doing so makes it harder to assess the effect of each individual change, and you may miss a better network because of it.
- Use at least 200 epochs, and employ weight decay, momentum with nesterov=True, and a cosine annealing learning rate scheduler, e.g.:

epochs = 300
optimizer = gpt.GeoNDSGD(net.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4, nesterov=True)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
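The first recommendation above (a paraboloid layer right before the Linear output head) can be sketched as follows. This is a runnable illustration using nn.Linear as a stand-in for the paraboloid op, since the exact signature should be confirmed against the library docs; per the post, the replacement keeps the same arguments as nn.Linear.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Sketch of the recommended placement: an extra layer just before
    the Linear output head. Swap `self.para` for the library's Paraboloid
    op (same in/out arguments) to follow the post's advice."""

    def __init__(self, in_features=512, hidden=256, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        # Stand-in layer; with the library installed this would be e.g.
        # gpt.Paraboloid(hidden, hidden) — same arguments as nn.Linear.
        self.para = nn.Linear(hidden, hidden)
        self.head = nn.Linear(hidden, num_classes)  # output layer stays Linear

    def forward(self, x):
        return self.head(self.para(self.features(x)))

net = SmallNet()
out = net(torch.randn(4, 512))
print(out.shape)  # → torch.Size([4, 10])
```

Introducing only this one change at a time, as suggested above, makes it easy to attribute any difference in loss or accuracy to the paraboloid layer itself.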
Regarding the Licensed version of the library: it includes the most heavily optimized implementation of the ops, which is significantly faster (up to 10x); however, the computations performed should be identical to the Free version, up to numerical differences due to the order of operations. An internet connection is required to validate the license. The license restricts how many instances of the library can be used concurrently to train neural networks. For example, with 512 instances you can concurrently train 16 networks on 32 GPUs each, 64 networks on 8 GPUs each, or any combination that does not exceed a total of 512. Furthermore, purchasing a license also gives you access to an Inference version of the library that 1) only includes the forward passes, 2) is not restricted, 3) does not expire, up to the version acquired during an active license, and 4) does not require an internet connection. Verified academic institutions can also acquire a heavily discounted license that additionally prohibits commercial use.
Finally, if there is enough interest, I am considering publishing a GitHub repository with the full source code of the Free version of the Paraboloid op, for both PyTorch and TensorFlow, depending only on g++/gcc and nvcc.
Happy experimentation.