Slot Wrapper Python
FunctionWrapper is a design pattern used when dealing with relatively complicated functions. The wrapper function typically performs some prologue and epilogue tasks like
- allocating and disposing resources
- checking pre- and post-conditions
- caching / recycling a result of a slow computation
but otherwise it should be fully compatible with the wrapped function, so it can be used instead of it. (This is related to the DecoratorPattern.)
As of Python 2.1 and the introduction of nested scopes, wrapping a function is easy:
The additional decorate function is needed to work with the Python 2.4 decorator syntax.
Now, let's wrap something up:
The wrapping effect is:
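The code listings for the steps above did not survive in this copy, so here is a minimal sketch of what they describe. It is written for Python 3 rather than the Python 2.x the text refers to. Only the name decorate comes from the text; wrap, trace_in, trace_out, calc and the use of functools.wraps are assumptions made for illustration.

```python
import functools

def wrap(pre, post):
    """Return a decorator that runs pre() before and post() after each call."""
    def decorate(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def call(*args, **kwargs):
            pre(func, *args, **kwargs)
            result = func(*args, **kwargs)
            post(func, *args, **kwargs)
            return result
        return call
    return decorate

events = []  # records the prologue and epilogue calls

def trace_in(func, *args, **kwargs):
    events.append("Entering function " + func.__name__)

def trace_out(func, *args, **kwargs):
    events.append("Leaving function " + func.__name__)

@wrap(trace_in, trace_out)
def calc(x, y):
    return x + y

print(calc(1, 2))
print(events)
```

The wrapped calc is called exactly like the original; the only visible difference is that each call also records one entry before and one after the real work.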
These names can't point directly to the C functions; instead they point to (you guessed it) special 'slot wrapper' objects, which encapsulate all the information necessary to actually call the C functions from Python. The one exception to this is __new__, which is done differently for reasons that I'll cover in a future entry.
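The difference can be observed from Python itself. The following is a small sketch that assumes CPython 3.7 or later (where the types module exposes WrapperDescriptorType); the class Plain is a hypothetical example.

```python
import types

# object.__init__ fills a C-level type slot, so Python sees a slot wrapper.
print(repr(object.__init__))
print(type(object.__init__) is types.WrapperDescriptorType)

# __new__ is the exception: it is exposed as a built-in function instead.
print(type(object.__new__) is types.BuiltinFunctionType)

class Plain:
    def __init__(self):
        pass

# A Python-level __init__ is an ordinary function placed into the slot.
print(type(Plain.__init__) is types.FunctionType)
```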
Of course, a wrapper would normally perform some more useful task. Have a look here for a recipe how to wrap a function that processes files so that the result is recycled from a cache file if appropriate.
Overview¶
This document describes the guts of jpype. It is intended to lay out the architecture of the jpype code to aid intrepid lurkers in developing and debugging the jpype code once I am run over by a bus. For most of this document I will use the royal we, except where I am giving personal opinions expressed only by yours truly, the author Thrameos.
History¶
When I started work on this project it had already existed for over 10 years. The original developer had intended a much larger design with modules to support multiple languages such as Ruby. As such it was constructed with three layers of abstraction. It had a wrapper layer over Java in C++, a wrapper layer for the Python API in C++, and an abstraction layer intended to bridge Python and other interpreted languages. This multilayer abstraction meant that every debugging call had to drop through all of those layers. Memory management was split into multiple pieces with Java controlling a portion of it, C++ holding a bunch of resources, Python holding additional resources, and HostRef controlling the lifetime of objects shared between the layers. It also had its own reference counting system for handling Java references on a local scale.
This level of complexity was just about enough to scare off all but the most hardened programmer. Thus I set out to eliminate as much of this as I could. Java already has its own local referencing system in the form of LocalFrames. It was simply a matter of setting up a C++ object to hold the scope of the frames to eliminate that layer. The Java abstraction was laid out in a fashion somewhat orthogonal to the Java inheritance diagram. Thus that was reworked to something more in line, which could be safely completed without disturbing other layers. The multilanguage abstraction layer was already pierced in multiple ways for speed. However, as the abstraction interwove throughout all the library it was a terrible lift to remove, and thus required gutting the Python layer as well to support the operations that were being performed by the HostRef.
The remaining codebase is fairly slim and reasonably streamlined. This rework cut out about 30% of the existing code and sped up the internal operations. The Java C++ interface matches the Java class hierarchy.
Architecture¶
JPype is split into several distinct pieces.
- jpype Python module: The classes in this layer are all prefixed with J.
- _jpype CPython module: The native module is supported by a CPython module called _jpype. The _jpype module is located in native/python and has C style classes with the prefix PyJP. This CPython layer acts as a front end for passing to the C++ layer. It performs some error checking. In addition to the module functions in _JModule, the module has multiple Python classes to support the native jpype code, such as _JClass, _JArray, _JValue, etc.
- CPython API wrapper: This layer is located in native/python and has the prefix JPPy for all classes. jp_pythontypes wraps the required parts of the CPython API in C++ for use in the C++ layer.
- C++ JNI layer: This layer is located in native/common and has the namespace JP. The code is divided into wrappers for each Java type, a typemanager for mapping from Java names to class instances, support classes for proxies, and a thin JNI layer used to help ensure rigorous use of the same design patterns in the code. The primary responsibility of this layer is type conversion and matching of method overloads.
- Java layer: The code for this layer is found in native/java. The Java layer is divided into two parts, a bootstrap loader and a jar containing the support classes. The Java layer is responsible for managing the lifetime of shared Python, Java, and C++ objects.
jpype module¶
The jpype module itself is made of a series of support classes which act as factories for the individual wrappers that are created to mirror each Java class. Because it is not possible to wrap all Java classes with statically created wrappers, jpype instead creates Python wrappers dynamically as requested by the user.
The wrapping process is triggered in two ways. The user can manually request creating a class by importing a class wrapper with jpype.imports or JPackage, or by manually invoking it with JClass. Or the class wrapper can be created automatically as a result of a return type or an exception thrown to the user.
Because the classes are created dynamically, the class structure uses a lot of Python meta programming. Each class wrapper derives from the class wrappers corresponding to the Java classes that the class extends and implements. The key to this is the hacked mro. The mro orders each of the classes in the tree such that the most derived class methods are exposed, followed by each parent class. This must be ordered to break ties resulting from multiple inheritance of interfaces. The factory classes are grafted into the type system using __instancecheck__ and __subclasscheck__.
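The grafting trick can be illustrated in pure Python. The sketch below is not JPype code: FactoryMeta, Factory, make_wrapper and the _is_java_wrapper marker are hypothetical names. It only shows how a metaclass can let a factory answer isinstance and issubclass for classes that were created at run time with type(). Note that the Python hook behind issubclass is spelled __subclasscheck__.

```python
class FactoryMeta(type):
    """Hypothetical metaclass: lets a factory answer isinstance/issubclass."""
    def __instancecheck__(cls, obj):
        return getattr(type(obj), "_is_java_wrapper", False)

    def __subclasscheck__(cls, sub):
        return getattr(sub, "_is_java_wrapper", False)

class Factory(metaclass=FactoryMeta):
    """Stand-in for a factory; the wrappers below never inherit from it."""

def make_wrapper(name, bases=()):
    """Create a wrapper class at run time, the way a class factory would."""
    return type(name, bases, {"_is_java_wrapper": True})

ObjectW = make_wrapper("Object")
ListW = make_wrapper("List", (ObjectW,))
ArrayListW = make_wrapper("ArrayList", (ListW,))

# The most derived class comes first in the method resolution order.
print([c.__name__ for c in ArrayListW.__mro__])
print(isinstance(ArrayListW(), Factory), issubclass(ListW, Factory))
```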
resource types¶
JPype largely maps to the same concepts as Python with a few special elements. The key concept is that of a Factory which serves to create Java resources dynamically as requested. For example, there is no Python notation to create an int[][] as the concept of dimensions is fluid in Python. Thus a factory type creates the actual object instance type with JArray(JInt,2).
Like Python objects, Java objects derive from a type object called JClass that serves as a meta type for all Java derived resources. Additional type-like objects such as JArray and JInterface serve to probe the relationships between types. Java object instances are created by calling the Java class wrapper just like a normal Python class. A number of pseudo classes serve as placeholders for Java types so that it is not necessary to create the type instance when using them. These aliased classes are JObject, JString, and JException. Underlying all Java instances is the concept of a jvalue.
jvalue¶
In the earlier design, wrappers, primitives and objects were all separate concepts. At the JNI layer these are unified by a common element called jvalue. A jvalue is a union of all primitives with the jobject. The jobject can represent anything derived from Java object, including the pseudo class jstring.
This has been replaced with a Java slot concept which holds an instance of JPValue, which in turn holds a pointer to the C++ Java type wrapper and a Java jvalue union. We will discuss this object further in the CPython section.
Bootstrapping¶
The most challenging part in working with the jpype module, other than the need to support both major Python versions with the same codebase, is the bootstrapping of resources. In order to get the system working, we must pass the Python resources so the _jpype CPython module can acquire resources and then construct the wrappers for java.lang.Object and java.lang.Class. The key difficulty is that we need reflection to get methods from Java and those are part of java.lang.Class, but Class inherits from java.lang.Object. Thus Object and the interfaces that Class inherits must all be created blindly. The order of bootstrapping is controlled by a specific sequence of boot actions after the JVM is started in startJVM. The class instance class_ may not be accessed until after all of the basic class, object, and exception types have been loaded.
Factories¶
The key objects exposed to the user (JClass, JObject, and JArray) are each factory meta classes. These classes serve as the gatekeepers to creating the meta classes or object instances. These factories inherit from the Java class meta and have a class_ instance inserted after the JVM is started. They do not have exposed methods as they act as shadows for actual Java types.
The user calls the factory with the specified arguments to create a resource. The factory calls the __new__ method when creating an instance of the derived object. The C++ wrapper calls the method with an internally constructed resource such as _JClass or _JValue. Most of the internal calls currently create the resource directly without calling the factories. The gateway for this is PyJPValue_create, which delegates the process to the corresponding specialized type.
Style¶
One of the aspects of the jpype design is the elegance of the factory patterns. Rather than expose the user to a large number of distinct concepts with different names, the factories provide powerful functionality with the same syntax for related things. Boxing a primitive, casting to a specific type, and creating a new object are all tied together in one factory, JObject. By also making that factory an effective base class, we allow it to be used for issubclass and isinstance.
This philosophy is further enhanced by silent customizers which integrate Python functionality into the wrappers such that Java classes can be used effectively with Python syntax. Consistent use and misuse of Python concepts, such as with for defining blocks like try with resources and synchronized, hide the underlying complexity and give the user the feeling that the module is as completely integrated as a solution such as jython.
When adding a new feature to the Python layer, consider carefully if the feature needs to be exposed as a new function or if it can be hidden in the normal Python syntax.
JPype does somewhat break the Python naming conventions. Because Java and Python have very different naming schemes, at least part of the kit would have a different convention. To avoid having one portion break Python conventions and another part conform, we choose to use Java notation consistently throughout. Package names should be lowercase with underscores, classes should be camel case starting upper, and functions and methods should be camel case starting lower. All private methods and classes start with a leading underscore and are not exported.
Customizers¶
There was a major change in the way the customizers work between versions. The previous system was undocumented and has now been removed, but as someone may have made use of it previously, we will contrast it with the revised system so that the customizers can be converted.
In the previous system, a global list stored all customizers. When a class was created, it went through the list and asked each customizer if it matched that class name. If it matched, it altered the dict of members to be created so that when the dynamic class was finished it had the custom behavior. This system wasn’t very scalable as each customizer added more work to the class construction process.
The revised system works by storing a dictionary keyed to the class name. Thus the customizer only applies to the specific class targeted by the customizer. The customizer is specified using annotation of a prototype class, making methods automatically copy onto the class. However, sometimes a customizer needs to be applied to an entire tree of classes, such as all classes that implement java.util.List. To handle this case, the class creation system looks for a special method __java_init__ in the tree of base classes and calls it on the newly created class. Most of the time the customization was the same simple pattern, so we added a sticky flag to build the initialization method directly. This method can alter the class to make it add the new behavior. Note the word alter. Where before we changed the member prior to creating the class, here we are altering the class. Thus the customizer is expected to monkey patch the existing class. There is only one pattern of monkey patching that works on both Python 2 and Python 3, so be sure to use the type.__setattr__ method of altering the class dictionary.
It is possible to apply customizers after the class has already been created because we operate by monkey patching. But there is a limitation that there can only be one __java_init__ method, and thus two customizers specifying a global behavior on the same class wrapper will lead to unexpected behavior.
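A rough pure-Python sketch of the revised scheme follows. It is not the JPype implementation: the registry _customizers, the decorator customizer, the helper apply_customizers and the example classes are all hypothetical, and the sticky flag is not modeled. Only the __java_init__ hook name and the use of type.__setattr__ come from the text.

```python
_customizers = {}  # hypothetical registry: Java class name -> prototype class

def customizer(java_name):
    """Register a prototype class whose methods are copied onto the wrapper."""
    def register(proto):
        _customizers[java_name] = proto
        return proto
    return register

def apply_customizers(wrapper, java_name):
    """Monkey patch an already created wrapper class."""
    proto = _customizers.get(java_name)
    if proto is not None:
        for name, member in vars(proto).items():
            if not name.startswith("__"):
                type.__setattr__(wrapper, name, member)
    # Tree-wide hook: call the first __java_init__ found in the base classes.
    for base in wrapper.__mro__[1:]:
        hook = vars(base).get("__java_init__")
        if hook is not None:
            hook(wrapper)
            break

@customizer("java.util.ArrayList")
class _ArrayListProto:
    def first(self):
        return self.items[0]

class ListBase:
    def __java_init__(cls):
        # Applies to every class created below ListBase.
        type.__setattr__(cls, "__len__", lambda self: len(self.items))

ArrayList = type("ArrayList", (ListBase,), {"items": [1, 2, 3]})
apply_customizers(ArrayList, "java.util.ArrayList")
print(ArrayList().first(), len(ArrayList()))
```

Because the lookup is a single dictionary access by class name, the cost of class construction no longer grows with the number of registered customizers.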
_jpype CPython module¶
Diving deeper into the onion, we have the Python front end. This is divided into a number of distinct pieces. Each piece is found under native/python and is named according to the piece it provides. For example, PyJPModule is found in the file native/python/pyjp_module.cpp.
Earlier versions of the module had all of the functionality in the module's global space. This functionality is now split into a number of classes. These classes each have a constructor that is used to create an instance which will correspond to a Java resource such as a class, array, method, or value.
JPype objects work with the inner layers by inheriting from a set of special _jpype classes. This class hierarchy is maintained by the meta class _jpype._JClass. The meta class does type hacking of the Python API to insert a reserved memory slot for the JPValue structure. The meta class is used to define the Java base classes:
- _JClass: Meta class for all Java types which maps to a java.lang.Class extending Python type.
- _JArray: Base class for all Java array instances.
- _JObject: Base type of all Java object instances extending Python object.
- _JNumberLong: Base type for integer style types extending Python int.
- _JNumberFloat: Base type for float style types extending Python float.
- _JNumberChar: Special wrapper type for JChar and java.lang.Character types extending Python float.
- _JException: Base type for exceptions extending Python Exception.
- _JValue: Generic capsule representing any Java type or instance.
These types are exposed to Python to implement Python functionality specific to the behavior expected by the Python type. Under the hood these types are largely ignored. Instead, the internal calls consult the Java slot to determine how to handle the type. Therefore, internally, Python methods will often be applied to the “wrong” type, as the requirement for the method can be satisfied by any object with a Java slot rather than a specific type.
See the section regarding Java slots for details.
PyJPModule module¶
This is the front end for all the global functions required to support the Python native portion. Most of the functions provided in the module are for control and auditing.
Resources are created by setting attributes on the _jpype module prior to calling startJVM. When the JVM is started, each of the required resources is copied from the module attribute lists to the module internals. Setting the attributes after the JVM is started has no effect. Resources are verified to exist when the JVM is started and any missing resources are reported as an error.
_JClass class¶
The class wrappers have a metaclass _jpype._JClass which serves as the guardian to ensure the slot is attached, provide for the inheritance checks, and control access to static fields and methods. The slot holds a java.lang.Class instance but it does not have any of the methods normally associated with a Java class instance exposed. A java.lang.Class instance can be converted to a Java class wrapper using JClass.
_JMethod class¶
This class acts as a descriptor with a call method. As it is a descriptor, accessing its methods through the class will trigger its __get__ function, thus getting ahold of it within Python is a bit tricky. The __get__ method is used to bind the static unbound method to a particular object instance so that we can call it with the first argument as the this pointer.
It has some reflection and diagnostics methods that can be useful in tracing down errors. The beans methods are there just to support the old properties API.
The naming on this class is a bit deceptive. It does not correspond to a single method but rather to all the overloads with the same name. When called, it passes the arguments to the C++ layer where the call must be resolved to a specific overload.
This class is stored directly in the class wrappers.
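The descriptor mechanics can be sketched in pure Python. OverloadedMethod and Box below are hypothetical names, and the overload lookup is a plain exact-type dictionary rather than the C++ resolution JPype performs. Unlike the behavior described above, this sketch returns the descriptor itself when accessed through the class.

```python
class OverloadedMethod:
    """Hypothetical stand-in for a descriptor holding all overloads of a name."""
    def __init__(self, name, overloads):
        self.name = name
        self.overloads = overloads  # maps a tuple of argument types to a function

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed through the class: stay unbound
        return lambda *args: self(obj, *args)  # bind obj as the "this" argument

    def __call__(self, this, *args):
        key = tuple(type(a) for a in args)  # exact type match only in this sketch
        try:
            impl = self.overloads[key]
        except KeyError:
            raise TypeError("no matching overload for " + self.name) from None
        return impl(this, *args)

class Box:
    put = OverloadedMethod("put", {
        (int,): lambda this, v: "int:%d" % v,
        (str,): lambda this, v: "str:" + v,
    })

b = Box()
print(b.put(1), b.put("a"))
# The class dictionary gives the raw descriptor without going through __get__.
print(type(vars(Box)["put"]).__name__)
```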
_JField class¶
This class is a descriptor with __get__ and __set__ methods. When called at the static class layer it operates on static fields. When called on a Python object, it binds to the object making a this pointer. If the field is static, it will continue to access the static field, otherwise, it will provide access to the member field. This trickery allows both static and member fields to wrap as one type.
This class is stored directly in the class wrappers.
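A minimal pure-Python sketch of one descriptor type serving both kinds of field is shown below. Field and Point are hypothetical names and the storage is a plain Python attribute rather than a Java field. Assignment through the class itself (Point.count = 1) would bypass __set__ in plain Python; intercepting that would need support from the metaclass, which this sketch leaves out.

```python
class Field:
    """Hypothetical descriptor covering both static and member fields."""
    def __init__(self, name, static=False, default=None):
        self.name = name
        self.static = static
        self.default = default
        self.value = default  # shared storage, used only for static fields

    def __get__(self, obj, objtype=None):
        if self.static:
            return self.value  # same value through class or instance
        if obj is None:
            raise AttributeError(self.name + " is not a static field")
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        if self.static:
            self.value = value  # writes through to the shared value
        else:
            obj.__dict__[self.name] = value

class Point:
    count = Field("count", static=True, default=0)
    x = Field("x", default=0)

p, q = Point(), Point()
p.x = 3
p.count = 7
print(p.x, q.x, q.count, Point.count)
```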
_JArray class¶
Java arrays are extensions of the Java object type. They have both the methods associated with java.lang.Object and Python array functionality. Primitives have specialized implementations to allow for the Python buffer API.
_JMonitor class¶
This class provides synchronized to JPype. Instances of this class are created and held using with. It has two methods, __enter__ and __exit__, which hook into the Python RAII system.
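The with protocol behind this can be sketched with a plain Python lock. Monitor below is a hypothetical stand-in that keys one re-entrant lock per object identity; it does not touch Java monitors, and the lock table is never cleaned up, which is acceptable only for a sketch.

```python
import threading

class Monitor:
    """Hypothetical stand-in for a synchronized block on one object."""
    _locks = {}                 # id(object) -> lock; sketch only, never pruned
    _guard = threading.Lock()   # protects the lock table itself

    def __init__(self, obj):
        with Monitor._guard:
            self._lock = Monitor._locks.setdefault(id(obj), threading.RLock())

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False  # never swallow exceptions raised inside the block

total = {"n": 0}

def worker():
    for _ in range(1000):
        with Monitor(total):  # every thread locks on the same object
            total["n"] += 1

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(total["n"])
```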
_JValue class¶
Java primitive and object instances derive from special Python derived types. These each have the Python functionality to be exposed and a Java slot. The most generic of these is _JValue, which is simply a capsule holding the Java C++ type wrapper and a Java jvalue union. CPython methods for the PyJPValue apply to all CPython objects that hold a Java slot.
Specific implementations exist for objects, numbers, characters, and exceptions. But fundamentally all are treated the same internally, and thus the CPython type is effectively erased outside of Python.
Unlike jvalue, we hold the object type in the C++ JPValue object. The class reference is used to determine how to match the arguments to methods. The class may not correspond to the actual class of the object. Using a class other than the actual class serves to allow an object to be cast and thus treated like another type for the purposes of overloading. This mechanism is what allows the JObject factory to perform a typecast to make an object instance act like one of its base classes.
Java Slots¶
The key to achieving reasonable speed within CPython is the use of slots. A slot is a dedicated memory location that can be accessed without consulting the dictionary or bases of an object. CPython achieves this by reserving space within the type structure and by using a set of bit flags so that it can avoid costly lookups. The reserved space is ordered by number and thus avoids the need to access the dictionary, while the bit flags serve to determine the type without traversing the __mro__ structure. We had to implement the same effect while deriving from a wide variety of Python types including type, object, int, long, and Exception. Adding the slot directly to the type and object's base memory does not work because these types all have different memory layouts. We could have a table look up based on the type, but because we must obey both the CPython and the Java object hierarchy at the same time it cannot be done within the memory layout of Python objects. Instead we have to think outside the box, or rather outside the memory footprint of Python objects.
CPython faces the same conflict internally, as inheritance often forces adding a dictionary or weak reference list onto a variably sized type such as long. For those cases it adds extra space to the basesize of the object and then ignores that space for the purposes of checking inheritance. It pairs this with an offset slot that allows for location of the dynamically placed slots. We cannot replicate this in the same way because the CPython internals are all specialized static members and there is no provision for introducing user defined dynamic slots.
Therefore, instead we add extra memory outside the view of Python objects through the use of a custom allocator. We intercept the call to create an object allocation and then call the regular Python allocators with the extra memory added to the request. As our extra slot has resources in the form of Java global references associated with it, we must deallocate those resources regardless of the type that has been extended. We perform this task by creating a custom finalize method to serve as the destructor. Thus a Java slot requires overriding each of tp_alloc, tp_free and tp_finalize. The class meta gatekeeper creates each type and verifies that the required hooks are all in place. If the user tries to bypass this it should produce an error.
In place of Python bit flags to check for the presence of a Java slot, we instead test the slot table to see if our hooks are in place. We can test if the slot is present by looking to see if both tp_alloc and tp_finalize point to our Java slot handlers. This means we are still effectively a slot as we can test and access with O(1).
Accessing the slot requires testing if the slot exists for the object, then computing the size of the object using the basesize and itemsize associated with the type, and then offsetting the Python object pointer appropriately. The overall cost is O(1), though it is slightly heavier than directly accessing an offset.
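The offset arithmetic can be modeled in a few lines of Python. This is only a sketch of the computation described above: the real code lives in the C++ layer and first checks the tp_alloc and tp_finalize hooks. The function name slot_offset and the rounding to an 8 byte alignment are assumptions made for illustration.

```python
def slot_offset(basicsize, itemsize, nitems, align=8):
    """Byte offset of memory appended after an object's own storage."""
    size = basicsize + itemsize * nitems
    return (size + align - 1) // align * align  # round up to the alignment

# Fixed-size type: the extra memory starts right after the base structure.
print(slot_offset(object.__basicsize__, object.__itemsize__, 0))

# Variable-size type: the item count shifts where the extra memory begins.
t = (1, 2, 3)
print(slot_offset(tuple.__basicsize__, tuple.__itemsize__, len(t)))
```

The cost is a handful of integer operations per access, which is why the lookup stays O(1) even though it is not a fixed offset.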
CPython API layer¶
To make creation of the C++ layer easier, a thin wrapper over the CPython API was developed. This layer provides for handling the CPython referencing using a smart pointer, defines the exception handling for Python, and provides resource hooks for duck typing of the _jpype classes.
This layer is located with the rest of the Python code in native/python, but has the prefix JPPy for its classes. As the bridge between Python and C++, these support classes appear in both the _jpype CPython module and the C++ JNI layer.
Exception handling¶
A key piece of the jpype interaction is the transfer of exceptions from Java to Python. To accomplish this, each Python method that can result in a call to Java must have a try block around the contents of the function.
We use a routine pattern of code to interact with Java to achieve this:
All entry points from Python into _jpype should be guarded with this pattern.
There are exceptions to this pattern such as removing the logging, operating on a call that does not need the JVM running, or operating where the frame is already supported by the method being called.
Python referencing¶
One of the most miserable aspects of programming with CPython is the relative inconsistency of referencing. Each method in Python may use a Python object or steal it, or it may return a borrowed reference or give a fresh reference. Similar commands, such as getting an element from a list and getting an element from a tuple, can have different rules. This was a constant source of bugs requiring consultation of the Python manual for every line of code. Thus we wrapped all of the Python calls we were required to work with in jp_pythontypes.
Included in this wrapper is a Python reference counter called JPPyObject. Whenever an object is returned from Python it is immediately placed in the smart pointer JPPyObject with the policy that it was created with, such as use_, borrowed_, claim_ or call_.
- use_: This policy means that the reference counter needs to be incremented at the start and decremented at the end. We must reference it because if we don’t and some Python call destroys the reference out from under us, the system may crash and burn.
- borrowed_: This policy means we were given a borrowed reference that we are expected to reference and unreference when complete, but the command that returned it can fail. Thus before referencing it, the system must check if an error has occurred. If there is an error, it is promoted to an exception.
- claim_: This policy is used when we are given a new object which is already referenced for us. Thus we are to steal the reference for the duration of our use and then dereference when we are done to keep it from leaking.
- call_: This policy both steals the reference and verifies there were no errors prior to continuing. Errors are promoted to exceptions when this reference is created.
If we need to pass an object which is held in a smart pointer to Python which requires a reference, we call keep on the reference, which transfers control to a PyObject* and prevents the pointer from removing the reference. As the object handle is leaving our control, keep should only be called in the return statement. The smart pointer is not used on method passing in which the parent explicitly holds a reference to the Python object. As all tuples passed as arguments operate like this, much of the API accepts bare PyObject* as arguments. It is the job of the caller to hold the reference for its scope.
On CPython extensions¶
CPython is somewhat of a nightmare to program in. It is not that they did not try to document the API, but it is darn complex. The problems extend well beyond the reference counting system that we have worked around. In particular, the object model, though well developed, is very complex. Often to get it to work you must follow letter for letter the example in the CPython user guide, and even then it may all go into the ditch.
The key problem is that there are a lot of very bad examples of how to write CPython extension modules out there. Often these examples bypass the appropriate macro and just call the field, or skip the virtual table and try to call the Python method directly. It is true that these things do not break their example, but they are conditioned on the methods they are calling directly being the right ones for the job, and that depends a lot on what the behavior of the object is supposed to be. Get it wrong and you get a really nasty segfault.
CPython itself may be partly responsible for some of these problems. They generally seem to trust the user and thus don’t verify if the call makes sense. It is true that it will cost a little speed to be aggressive about checking that the type flags and the allocator match, but not checking when the error happens means that it fails far from the original problem source. I would hope that we have moved beyond the philosophy that the user should just do whatever they want so it runs as fast as possible, but that never appears to be the case. Of course, I am just opining from the outside of the tent and I am sure the issues are much more complicated than they appear superficially. Then again, if I can manage to provide a safe workspace while juggling the issues of multiple virtual machines, I am free to have opinions on the value of trading performance and safety.
In short, when working on the extension code, make sure you do everything by the book, and check that book twice. Always go through the type's virtual table and use the proper macros to access the resources. Miss one line in some complex pattern even once and you are in for a world of hurt. There are very few guard rails in the CPython code.
C++ JNI layer¶
The C++ layer has a number of tasks. It is used to load thunks, call JNI methods, provide reflection of classes, determine if a conversion is possible, perform conversion, match arguments to overloads, and convert return values back to Java.
Memory management¶
Java provides built in memory management for controlling the lifespan of Java objects that are passed through JNI. When a Java object is created or returned from the JVM, it returns a handle to the object with a reference counter. To manage the lifespan of this reference counter a local frame is created. For the duration of this frame all local references will continue to exist. To extend the lifespan, either a new global reference to the object needs to be created or the object needs to be kept. When the local frame is destroyed, all local references are destroyed with the exception of an optional specified local return reference.
We have wrapped the Java reference system with the wrapper JPLocalFrame. This wrapper has three functions. It acts as a RAII (Resource acquisition is initialization) for the local frame. Further, as creating a local frame requires creating a Java env reference and all JNI calls require access to the env, the local frame acts as the front end to call all JNI calls. Finally, as getting ahold of the env requires that the thread be attached to Java, it also serves to automatically attach threads to the JVM. As accessing an unbound thread will cause a segmentation fault in JNI, we are now safe from any threads created from within Python, even those created outside our knowledge. (I am looking at you, spyder.)
Using this pattern makes the JPype core safe by design. Forcing JNI calls to be made using the frame ensures:
- Every local reference is destroyed.
- Every thread is properly attached before JNI is used.
- The pattern of keeping only one local reference is obeyed.
To use a local frame, use the pattern shown in this example.
Note that the value of the object returned and the object in the function will not be the same. The returned reference is owned by the enclosing local frame and points to the same object. But as its lifespan belongs to the outer frame, its location in memory is different. You are allowed to keep a reference that was global or was passed in; in either of those cases, the outer scope will get a new local reference that points to the same object. Thus you don’t need to track the origin of the object.
The changing of the value while pointing to the same object is another common problem. A routine error is to get a local reference, call NewGlobalRef and then keep the local reference rather than the shiny new global reference it made. This is not like the Python reference system where you have the object that you can ref and unref. Thus make sure you always store only the global reference.
But don’t mistake this as an invitation to make global references everywhere.Global reference are global, thus will hold the member until the reference isdestroyed. C++ exceptions can lead to missing the unreference, thus globalreferences should only happen when you are placing the Java object into a classmember variable or a global variable.
To help manage global references, we have JPRef<>
which holds a globalreference for the duration of the C++ lifespace. This is the base class foreach of the global reference types we use.
For functions that expect the outer scope to already have created a framefor this context, we use the pattern of extending the outer scope ratherthan creating a new one.
Although the system we have set up is “safe by design”, there are things thatcan go wrong is misused. If the caller fails to create a frame prior tocalling a function that returns a local reference, the reference will go intothe program scoped local references and thus leak. Thus, it is usually best toforce the user to make a scope with the frame extension pattern. Second, if anyJNI references that are not kept or converted to global, it becomes invalid.Further, since JNI recycles the reference pointer fairly quickly, it mostlikely will be pointed to another object whose type may not be expected. Thus,best case is using the stale reference will crash and burn. Worse case, thereference will be a live reference to another object and it will produce anerror which seems completely irrelevant to anything that was being called.Horrible case, the live object does not object to bad call and it all silentlyproceeds down the road another two miles before coming to flaming death.
Moral of the story, always create a local frame even if you are handling a global reference. If passed or returned a reference of any kind, it is a borrowed reference belonging to the caller or being held by the current local frame. Thus it must be treated accordingly. If you have to hold a global, use the appropriate JPRef class to ensure it is exception and dtor safe. For further information read native/common/jp_javaframe.h.
Type wrappers
Each Java type has a C++ wrapper class. These classes provide a number of methods. Primitives each have their own unique type wrapper. Objects, arrays, and class instances share a C++ wrapper type. Special instances are used for java.lang.Object and java.lang.Class. The type wrappers are named for the class they wrap, such as JPIntType.
Type conversion
For type conversion, a C++ class wrapper provides four methods.
canConvertToJava
- This method must consult the supplied Python object to determine the type and then make a determination of whether a conversion is possible. It reports none_ if there is no possible conversion, explicit_ if the conversion is only acceptable if forced such as returning from a proxy, implicit_ if the conversion is possible and acceptable as part of a method call, or exact_ if this type converts without ambiguity. It is expected to check for something that is already a Java resource of the correct type such as JPValue, or something that is implementing the behavior as an interface in the form of a JPProxy.
convertToJava
- This method consults the type and produces a conversion. The order of the match should be identical to that of canConvertToJava. It should also handle values and proxies.
convertToPythonObject
- This method takes a jvalue union and converts it to the corresponding Python wrapper instance.
getValueFromObject
- This converts a Java object into the corresponding JPValue. This unboxes primitives.
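The relationship between the match levels and the two conversion methods can be modelled in a few lines of Python. This is only a sketch of the idea, not the C++ implementation: the Match enum, the JavaIntStandIn class, both function names, and the specific rules (int is implicit, float is explicit) are assumptions chosen for illustration. The point it shows is that the converter walks the same checks in the same order as the matcher.

```python
from enum import IntEnum


class Match(IntEnum):
    """Conversion quality, ordered worst to best (names mirror the text)."""
    NONE = 0
    EXPLICIT = 1
    IMPLICIT = 2
    EXACT = 3


class JavaIntStandIn:
    """Hypothetical stand-in for a value that is already a Java int."""

    def __init__(self, value):
        self.value = value


def can_convert_to_java_int(obj):
    """Report how well obj converts to a Java int (illustrative rules only)."""
    if isinstance(obj, JavaIntStandIn):
        return Match.EXACT       # already a Java resource of the right type
    if isinstance(obj, int):
        return Match.IMPLICIT    # acceptable as part of a method call
    if isinstance(obj, float):
        return Match.EXPLICIT    # lossy, so only acceptable when forced
    return Match.NONE


def convert_to_java_int(obj):
    """Convert using the same order of checks as can_convert_to_java_int."""
    if isinstance(obj, JavaIntStandIn):
        return obj.value
    if isinstance(obj, int):
        if not -2**31 <= obj < 2**31:
            raise OverflowError("value does not fit in a Java int")
        return obj
    if isinstance(obj, float):
        return int(obj)
    raise TypeError(f"no conversion from {type(obj).__name__}")
```

Because the levels are ordered, an overload resolver built on this model could compare them directly, for example preferring a candidate whose worst argument match is Match.EXACT over one that is only Match.IMPLICIT.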
Array conversion
In addition to converting single objects, the type wrappers also serve as the gateway to working with arrays of the specified type. Five methods are used to work with arrays: newArrayInstance, getArrayRange, setArrayRange, getArrayItem, and setArrayItem.
Invocation and Fields
To convert a return type produced from a Java call, each type needs to be able to invoke a method with that return type. This corresponds to the underlying JNI design. The methods invoke and invokeStatic are used for this purpose. Similarly, accessing fields requires type conversion using the methods getField and setField.
Instance versus Type wrappers
Instances of individual Java classes are made from JPClass. However, two special sets of conversion rules are required. These are in the form of specializations JPObjectBaseClass and JPClassBaseClass corresponding to java.lang.Object and java.lang.Class.
Support classes
In addition to the type wrappers, there are several support classes. These are:
JPTypeManager
- The typemanager serves as a dict for all type wrappers created during the operation.
JPReferenceQueue
- Lifetime manager for Java and Python objects.
JPProxy
- Proxies implement a Java interface in Python.
JPClassLoader
- Loader for Java thunks.
JPEncoding
- Decodes and encodes Java UTF strings.
JPTypeManager
C++ type wrappers are created as needed. Instances of each of the primitives along with java.lang.Object and java.lang.Class are preloaded. Additional instances are created as requested for individual Java classes. Currently this is backed by a C++ map of string to class wrappers.
The typemanager provides a number of lookup methods.
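A rough Python model of this create-on-demand registry is given below. It is a sketch under assumptions, not a transcription of the C++ class: TypeWrapper, TypeManagerSketch, and find_class are hypothetical names, and a Python dict stands in for the C++ map of string to class wrappers.

```python
class TypeWrapper:
    """Hypothetical stand-in for a C++ type wrapper such as JPIntType."""

    def __init__(self, name):
        self.name = name


class TypeManagerSketch:
    """Maps Java type names to wrappers, creating them on first request."""

    # The Java primitives plus the two special classes are preloaded.
    PRELOADED = ("boolean", "byte", "char", "short", "int", "long",
                 "float", "double", "java.lang.Object", "java.lang.Class")

    def __init__(self):
        self._wrappers = {name: TypeWrapper(name) for name in self.PRELOADED}

    def find_class(self, name):
        """Return the wrapper for name, creating and caching it if needed."""
        wrapper = self._wrappers.get(name)
        if wrapper is None:
            wrapper = TypeWrapper(name)
            self._wrappers[name] = wrapper
        return wrapper
```

Looking up the same name twice returns the same wrapper object, which is what lets the rest of the code compare wrappers by identity.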
JPReferenceQueue
When a Python object is presented to Java as opposed to a Java object, the lifespan of the Python object must be extended to match the Java wrapper. The reference queue adds a reference to the Python object that will be removed by the Java layer when the garbage collection deletes the wrapper. This code is almost entirely in the Java library, thus only the portion to support Java native methods appears in the C++ layer.
Once started, the reference queue is mostly transparent. registerRef is used to bind a Python object lifespan to a Java object.
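The lifetime binding can be illustrated in pure Python with weakref.finalize. This is an analogy only: the real mechanism lives in the Java library and is driven by the Java garbage collector, and the names ReferenceQueueSketch, JavaWrapperStandIn, register_ref, and held_count are made up for this sketch.

```python
import weakref


class JavaWrapperStandIn:
    """Hypothetical stand-in for the Java-side wrapper of a Python object."""


class ReferenceQueueSketch:
    """Keeps Python objects alive until their paired wrapper is collected."""

    def __init__(self):
        self._held = {}   # token -> Python object being kept alive
        self._next = 0

    def register_ref(self, java_wrapper, py_object):
        """Bind the lifespan of py_object to that of java_wrapper."""
        token = self._next
        self._next += 1
        self._held[token] = py_object
        # When the wrapper is garbage collected, drop the extra reference.
        weakref.finalize(java_wrapper, self._held.pop, token, None)
        return token

    def held_count(self):
        """Number of Python objects currently being kept alive."""
        return len(self._held)
```

The queue holds the only extra reference, so the Python object survives exactly as long as the wrapper does, and nothing else in the program has to track the pairing.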
JPProxy
In order to call Python functions from within Java, a Java proxy is used. The majority of the code is in Java. The C++ code holds the Java native portion. The native implementation of the proxy call is the only place in which the pattern for reflecting Python exceptions back into Java appears.
As all proxies are tied to Python references, this code is strongly tied to the reference queue.
JPClassLoader
This code is responsible for loading the Java class thunks. As it is difficult to ensure we can access a Java jar from within Python, all Java native code is stored in a binary thunk compiled into the C++ layer as a header. The class loader provides a way to load this embedded jar first by bootstrapping a custom Java classloader and then using that classloader to load the internal jar.
The classloader is mostly transparent. It provides one method called findClass which loads a class from the internal jar.
JPEncoding
Java’s concept of UTF is pretty much out of sync with the rest of the world. Java used 16 bits for its native characters. But this was inadequate for all of the unicode characters, thus longer unicode characters had to be encoded in the 16 bit space. Rather than directly providing methods to convert to a standard encoding such as UTF8, Java used UTF16 encoded in 8 bits which they dub Modified-UTF8. JPEncoding deals with converting this unusual encoding into something that Python can understand.
The key method in this module is transcribe with signature
There are two encodings provided, JPEncodingUTF8 and JPEncodingJavaUTF8. By selecting the source and target encoding, transcribe can convert between the Java and Python encodings.
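To make the difference between the two encodings concrete, here is a small pure-Python model of Modified UTF-8. It is not the C++ transcribe routine and the function names are invented for this sketch. It follows the format documented for java.io.DataInput: each UTF-16 code unit is encoded on its own in one to three bytes, so a character outside the 16 bit range becomes two three-byte sequences, and U+0000 is written as the two bytes C0 80.

```python
def encode_modified_utf8(text):
    """Encode a Python str as Java Modified UTF-8 bytes."""
    # Break the string into UTF-16 code units, as Java stores it.
    utf16 = text.encode("utf-16-be", "surrogatepass")
    out = bytearray()
    for i in range(0, len(utf16), 2):
        unit = (utf16[i] << 8) | utf16[i + 1]
        if 0x01 <= unit <= 0x7F:
            out.append(unit)
        elif unit <= 0x7FF:
            # Two bytes; this branch also maps U+0000 to C0 80.
            out.append(0xC0 | (unit >> 6))
            out.append(0x80 | (unit & 0x3F))
        else:
            # Three bytes; surrogate halves are encoded individually.
            out.append(0xE0 | (unit >> 12))
            out.append(0x80 | ((unit >> 6) & 0x3F))
            out.append(0x80 | (unit & 0x3F))
    return bytes(out)


def decode_modified_utf8(data):
    """Decode Java Modified UTF-8 bytes into a Python str."""
    units = bytearray()
    i = 0
    while i < len(data):
        b0 = data[i]
        if b0 < 0x80:
            unit = b0
            i += 1
        elif b0 & 0xE0 == 0xC0:
            unit = ((b0 & 0x1F) << 6) | (data[i + 1] & 0x3F)
            i += 2
        elif b0 & 0xF0 == 0xE0:
            unit = (((b0 & 0x0F) << 12)
                    | ((data[i + 1] & 0x3F) << 6)
                    | (data[i + 2] & 0x3F))
            i += 3
        else:
            raise ValueError(f"invalid lead byte {b0:#x} at offset {i}")
        units += unit.to_bytes(2, "big")
    # Reassemble surrogate pairs into full code points.
    return units.decode("utf-16-be", "surrogatepass")
```

For text limited to non-zero characters in the 16 bit range the output is byte-for-byte identical to standard UTF-8, which is why the mismatch tends to go unnoticed until an embedded NUL or an emoji shows up.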
Incidentally that same modified UTF coding is used in storing symbols in the class files. It seems like a really poor design choice given they have to document this modified UTF in multiple places. As far as I can tell the internal converter only appears on java.io.DataInput and java.io.DataOutput.
Java native code
At the lowest level of the onion is the native Java layer. Although this layer is most remote from Python, ironically it is the easiest layer to communicate with. As the point of jpype is to communicate with Java, it is possible to directly communicate with the jpype Java internals. These can be imported from the package org.jpype. The code for the Java layer is located in native/java. It is compiled into a jar in the build directory and then converted to a C++ header to be compiled into the _jpype module.
The Java layer currently houses the reference queue, a classloader which can load a Java class from a bytestream source, the proxy code for implementing Java interfaces, and a memory compiler module which allows Python to directly create a class from a string.
Tracing
Because the relations between the layers can be daunting, especially when things go wrong, the CPython and C++ layers have a built-in logger. This logger must be enabled with a compiler switch. To activate the logger, touch one of the cpp files in the native directory to mark the build as dirty, then compile the jpype module with:
Once built, run a short test program that demonstrates the problem and capture the output of the terminal to a file. This should allow the developer to isolate the fault to the specific location where it failed.
To use the logger in a function, start the function with JP_TRACE_IN(function_name), which will open a try catch block.
The JPype tracer can be augmented with the Python tracing module to give a very good picture of both JPype and Python states at the time of the crash. To use the Python tracing, start Python with…
Debugging issues
If the tracing function proves inadequate to identify a problem, we often need to turn to a general purpose tool like gdb or valgrind. The JPype core is not easy to debug. Python can be difficult to properly monitor especially with tools like valgrind due to its memory handling. Java is also challenging to debug. Put them together and you have the mother of all debugging issues. There are a number of complicating factors. Let us start with how to debug with gdb.
Gdb runs into two major issues, both tied to the signal handler. First, Java installs its own signal handlers that take over the entire process when a segfault occurs. This tends to cause very poor segfault stacktraces when examining a core file, which often is corrupt after the first user frame. Second, Java installs its signal handlers in such a way that attempting to run under a debugger like gdb will often immediately crash, preventing one from catching the segfault before Java catches it. This makes for a catch-22: you can’t capture a meaningful non-interactively produced core file, and you can’t get an interactive session to work.
Fortunately there are solutions to the interactive session issue. By disabling the SIGSEGV handler, we can get past the initial failure and also we can catch the stack before it is altered by the JVM.
Thus far I have not found any good solutions to prevent the JVM from altering the stack frames when dumping the core. Thus interactive debugging appears to be the best option.
There are additional issues that one should be aware of. Open-JDK 1.8 has had a number of problems with the debugger. Starting JPype under gdb may trigger the following error.
There are supposed to be fixes for this problem, but none worked for me. Upgrading to Open-JDK 9 appears to fix the problem.
Another complexity with debugging memory problems is that Python tends to hide the problem with its allocation pools. Rather than allocating memory when a new object is requested, it will often recycle an existing object which was collected earlier. The result is that an object which turns out to be still live becomes recycled as a new object with a new type. Thus suddenly a method which was expected to produce some result instead vectors into the new type table, which may or may not send us into segfault land depending on whether the old and new objects have similar memory layouts.
This can be partially overcome by forcing Python to use a different memory allocation scheme. This can avoid the recycling, which means we are more likely to catch the error, but at the same time means we will be executing different code paths so we may not reach a similar state. If the core dump is vectoring off into code that just does not make sense, it is likely caused by the memory pools. Starting with Python 3.6, it is possible to select the memory allocation policy through an environment variable. See the PYTHONMALLOC setting for details.
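As a minimal sketch of how this can be scripted, the helper below reruns a reproducer in a child interpreter with PYTHONMALLOC set. The helper name run_with_allocator is invented here. The value malloc routes every allocation to the C library allocator and so bypasses the pools; malloc_debug and debug are among the other documented values.

```python
import os
import subprocess
import sys


def run_with_allocator(script_args, allocator="malloc"):
    """Run a child Python interpreter with PYTHONMALLOC set to allocator."""
    # Copy the current environment and override only the allocator policy.
    env = dict(os.environ, PYTHONMALLOC=allocator)
    return subprocess.run([sys.executable, *script_args], env=env,
                          capture_output=True, text=True)
```

A call such as run_with_allocator(["reproducer.py"]), where reproducer.py is a placeholder for the short test program that triggers the fault, returns a CompletedProcess whose returncode and stderr can be compared against a run with the default allocator.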
Future directions
Although the majority of the code has been reworked for JPype 0.7, there is still further work to be done. Almost all Java constructs can be exercised from within Python, but Java and Python are not static. Thus, we are working on further improvements to the jpype core focusing on making the package faster, more efficient, and easier to maintain. This section will discuss a few of these options.
Java based code is much easier to debug as it is possible to swap the thunk code with an external jar. Further, Java has much easier management of resources. Thus pushing a portion of the C++ layer into the Java layer could further reduce the size of the code base. In particular, deciding the order of search for method overloads in C++ attempts to reconstruct the Java overload rules. But these same rules are already available in Java. Further, the C++ layer is designed to make many frequent small calls to Java methods. This is not the preferred way to operate in JNI. It is better to have specialized code in Java which performs large tasks, such as collecting all of the fields needed for a type wrapper and passing them back in a single call, rather than calling twenty different general purpose methods. This would also vastly reduce the number of jmethods that need to be bound in the C++ layer.
The world of JVMs is currently in flux. Jpype needs to be able to support other JVMs. In theory, so long as a JVM provides a working JNI layer, there is no reason jpype can’t support it. But we need loading routines for these JVMs to be developed if there are differences in getting the JVM launched.
A project page on GitHub shows what is being developed for the next release. Series 0.6 was usable, but early versions had notable issues with threading, and internal memory management concepts had to be redone for stability. Series 0.7 is the first version after the rewrite for simplification and hardening. I consider 0.7 to be at the level of production quality code suitable for most usage, though still missing some needed features. Series 0.8 will deal with higher levels of Python/Java integration such as Java class extension and pickle support. Series 0.9 will be dedicated to any additional hardening and edge cases in the core code as we should have complete integration. Assuming everything is completed, we will one day become a real boy and have a 1.0 release.