Looks like this is something I missed: graphql-java already provides extended scalars, so instead of manually creating an extension with @DgsScalar (where the return type of serialize should, I think, be Object instead of String), the schema could directly use a JSON or Object scalar type, which the dgs-client library parses just fine.
So the change needed is in pom.xml: include the corresponding version of `graphql-dgs-extended-scalars`.
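A sketch of the dependency declaration; the exact version is not stated in the note, and the coordinates below assume the published DGS artifact:

```xml
<dependency>
    <groupId>com.netflix.graphql.dgs</groupId>
    <artifactId>graphql-dgs-extended-scalars</artifactId>
    <!-- version elided; use the release matching your DGS framework version -->
</dependency>
```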
Then in the schema, just declare the scalar and use it as the type for the attribute:
```graphql
scalar JSON

type Position {
  ...
  cashByCurrencyUnadjustedUsd: JSON
}
```
This is the only change needed; from the dgs-client side, it should just work.
Looks like this is probably an issue with graphql-java, and GraphQL in general, where Map is not a supported type. I have tried to create a custom scalar type, something like:
In the schema:

```graphql
scalar Map
```

And the Java code:

```java
@DgsScalar(name = "Map")
public class DgsMap implements Coercing<Object, String> {
    private final ObjectMapper mapper = new ObjectMapper();

    @Override
    public String serialize(Object dataFetcherResult) throws CoercingSerializeException {
        try {
            // note: serialize here returns a String (the Map rendered as a JSON string)
            return mapper.writeValueAsString(dataFetcherResult);
        } catch (JsonProcessingException e) {
            throw new CoercingSerializeException(e);
        }
    }
    ...
}
```
This then produces the Map serialized as a JSON string, which Jackson in dgs-client still has issues parsing.
In the end, I left dgs-framework to serialize the Map as a String using the default `Map.toString()`. Then in dgs-client, I switched to Gson inside a custom deserializer to parse that String back into a Map.
Not sure whether that's the optimal approach, but it seems to be the only option I found working at the moment.
So basically, there is no change needed on the dgs-framework side. On the client, when dgs-client parses the response, it uses Gson to parse the `Map.toString()` value.
Something like:

```java
// on the client side
@JsonIgnoreProperties(ignoreUnknown = true)
public class POJO {
    ....
    @JsonDeserialize(using = CustomDeserializer.class)
    Map<String, Object> cashByCurrencyUnadjustedUsd;
}
```
Then the `CustomDeserializer` basically deserializes the string using Gson:
```java
public class CustomDeserializer extends StdDeserializer<Map> {

    private final Gson gson = new Gson();

    public CustomDeserializer() {
        this(null);
    }

    public CustomDeserializer(Class<?> vc) {
        super(vc);
    }

    @Override
    public Map deserialize(JsonParser jp, DeserializationContext ctxt)
            throws IOException {
        TextNode node = jp.getCodec().readTree(jp);
        String data = node.textValue();
        // Gson's lenient parsing can read the Map.toString() output
        // (unquoted keys/values, '=' separators) back into a Map
        return gson.fromJson(data, Map.class);
    }
}
```
There was an issue with one of the Python applications, which was running a database SELECT query wrapped in a transaction.
Looks like this is because pyodbc by default opens a database transaction with autocommit off. Instead, autocommit can be explicitly set to True when getting the connection object. Something like:
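A minimal sketch, assuming the ODBC connection string is held in `conn_str` (a hypothetical name):

```python
import pyodbc

# pyodbc keeps autocommit off by default, which implicitly wraps even a
# plain SELECT in a transaction; autocommit=True avoids that
conn = pyodbc.connect(conn_str, autocommit=True)
cursor = conn.cursor()
cursor.execute("SELECT 1")
```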
One useful tool is JOL which, similar to instrumentation but provided as a library, gives great insight into the classes and objects that are the potential culprits.
To look at a class, or the shallow size of its memory footprint:
```java
ClassLayout.parseClass(A.class).toPrintable()
```
This gives the shallow size, i.e. the object's own memory consumption, where each object-typed field is counted only as the size of the reference itself.
To get the deep size (the object's own size plus the size of the objects it references), JOL's GraphLayout can be used.
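A minimal sketch of both measurements, assuming JOL (`org.openjdk.jol`) is on the classpath and a hypothetical class `A`:

```java
import org.openjdk.jol.info.ClassLayout;
import org.openjdk.jol.info.GraphLayout;

// shallow size: the instance's own footprint, references counted as pointers
System.out.println(ClassLayout.parseClass(A.class).toPrintable());

// deep size: the instance plus everything reachable from it
A a = new A();
System.out.println(GraphLayout.parseInstance(a).totalSize());
```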
The application still faced OOM issues after running for some time. After analyzing the heap dump, it turned out there was an issue with the protobuf messages.
The application subscribes to two Kafka topics, one publishing messages in JSON format, the second in protobuf.
Both messages can contain up to 700 fields per message. This is not an issue for the consumer of the JSON topic: when parsing the Kafka binary message, it can ignore all unknown fields. This is a Jackson feature, controlled by `FAIL_ON_UNKNOWN_PROPERTIES` (which is enabled by default, meaning unknown fields fail the parse); with it disabled, the parser simply skips unknown fields instead of breaking on them.
With unknown-field skipping configured this way (it has to be, for the parser to work at all), the result is a good state where the unknown fields are not eating up the JVM heap.
Out of the ~700 fields, only ~30 are needed in the application at the moment. Not every Kafka message contains all 700 fields; sometimes it contains 100 fields, sometimes much more. But even in the case where 100 fields are sent over the wire, Jackson correctly discards the 70 unneeded ones without polluting the memory.
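A sketch of that Jackson configuration (assuming jackson-databind on the classpath):

```java
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

ObjectMapper mapper = new ObjectMapper();
// unknown JSON fields are silently skipped instead of failing the parse;
// FAIL_ON_UNKNOWN_PROPERTIES is enabled by default
mapper.disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);
```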
This is, however, not the case for protobuf, at least not with the default setup.
So with the protobuf topic also onboarded to the application, even though only 30 fields out of each message are really needed, it could introduce at least 2x the garbage onto the heap. And the thing with protobuf is: because the 30 fields are really needed, the protobuf message object is kept on the heap, unknown fields included, eventually promoted to the old gen, surviving many GC cycles (even some mixed and full GCs).
Ultimately the solution leverages a trick: the generated Message class provides a parser, which can be wrapped with `DiscardUnknownFieldsParser`, and the wrapped parser is then used to parse the binary payload.
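A sketch of the wrapping, assuming a generated protobuf class `MyMessage` (hypothetical name) and the raw record bytes in `payload`:

```java
import com.google.protobuf.DiscardUnknownFieldsParser;
import com.google.protobuf.Parser;

// the wrapped parser drops unknown fields during parsing instead of
// retaining them in the message's unknown-field set on the heap
Parser<MyMessage> parser = DiscardUnknownFieldsParser.wrap(MyMessage.parser());
MyMessage message = parser.parseFrom(payload);
```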
In addition, ZGC comes with uncommit turned on by default (`-XX:+ZUncommit`), which returns unused memory to the host OS. So even though the application can have a very large Xmx, memory that is unused after GC is handed back to the OS.
So with 450GB of memory allocated to the Docker container, the Java application with a 420GB Xmx and a 180GB Xms consumes less than 30GB of the OS memory.
I have an app which keeps a lot of data on the heap. With G1GC, the application would on and off be killed with OOM by Docker, even though G1 had been tuned over many attempts and had likely reached a pretty optimal level for what G1 can provide.
However, even with the tuning, G1 was still not triggered (no full GC, no major GC, not even a mixed GC) even when heap usage was already above 90% of the specified Xmx.
I didn't explore it much further, but I guess a possible solution would be to force G1 onto a more aggressive GC frequency or interval. However, that would come at a cost to the application in both throughput and latency.
However, once switched to ZGC, with a small concurrent GC thread count of 5, it is able to keep the heap in good shape. The OOMs are now gone.
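The switch described above boils down to JVM flags along these lines (a sketch: the heap sizes are the ones quoted earlier in these notes, `-XX:ConcGCThreads=5` sets the concurrent GC thread count, and `app.jar` is a placeholder):

```shell
java -XX:+UseZGC \
     -XX:ConcGCThreads=5 \
     -Xms180g -Xmx420g \
     -jar app.jar
```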