Systematic approach with Maven to deal with dependency hell

北城余情 posted on 2019-11-27 13:53:40

Question


I'm struggling with how to approach jar dependency hell. I have a Maven-IntelliJ Scala project that uses some AWS SDKs. Recently, adding the Kinesis SDK introduced incompatible versions of Jackson.

My question is: how do I systematically approach the problem of Jar hell?

I understand class loaders and how Maven chooses between duplicate jars, but I am still at a loss regarding actual practical steps to fix the issue.

My attempts at the moment are based on trial and error, and I am outlining them here with the Jackson example:

  • First, I see what the actual exception is, in this case NoSuchMethodError, on the Jackson data bindings ObjectMapper class. I then look at the Jackson docs to see when the method was added or removed. This is usually quite tedious, as I manually check the api docs for each version (question 1: is there a better way?).
  • Then, I use mvn dependency:tree to figure out which version of the Jackson I am actually using (question 2: is there an automatic way of asking maven which version of a jar is in use, rather than combing through the tree output?).
  • Finally, I compare the mvn dependency:tree output before adding the Kinesis SDK, and after, to detect differences in the mvn dependency:tree output, and hopefully see if the Jackson version changed. (question 3: How does maven use the libraries in shaded jars, when dependency resolution occurs? Same as any other?).

Finally, after comparing the tree outputs, I try to add the latest working version of Jackson explicitly in the POM, to trigger precedence in the Maven dependency resolution chain. If the latest does not work, I add the next most recent version, and so forth.

This entire procedure is incredibly tedious. Besides the specific questions I asked, I am also curious about other people's systematic approaches to this problem. Does any one have any resources that they use?


Answer 1:


I then look at the Jackson docs to see when the method was added or removed. This is usually quite tedious, as I manually check the api docs for each version (question 1: is there a better way?)

To check for breaking API changes there are several tools that automatically analyze jars and give you the right information. This Stack Overflow post has nice hints about some handy tools.
JAPICC seems quite good.

Then, I use mvn dependency:tree to figure out which version of the Jackson I am actually using (question 2: is there an automatic way of asking maven which version of a jar is in use, rather than combing through the tree output?)

mvn dependency:tree is definitely the way to go, but you can narrow its output from the start and only get what you are actually looking for, using its includes option as follows:

mvn dependency:tree -Dincludes=<groupId>

Note: you can also pass more detail to the includes option in the form groupId:artifactId:type:version, or use wildcards like *:artifactId.

It seems a small hint, but in large projects with many dependencies, narrowing down the output is a great help. Normally the groupId alone is enough as a filter, although *:artifactId is probably the fastest if you are looking for one specific dependency.

If you want a flat list of dependencies (not a tree), sorted alphabetically (quite handy in many scenarios), then the following may also help:

mvn dependency:list -Dsort=true -DincludeGroupIds=<groupId>

question 3: How does maven use the libraries in shaded jars, when dependency resolution occurs? Same as any other?

By shaded jars you may mean:

  • fat jars, which also bring other jars into the classpath. In this case they are seen as one dependency, one unit for Maven Dependency Mediation, and their content becomes part of the project classpath. In general, you shouldn't have fat jars among your dependencies, since you have no control over the packed libraries they bring in.
  • jars with shaded (renamed) packages. In this case, again, Maven Dependency Mediation has no control over the contents: it's one unit, one jar, identified by its GAVC (GroupId, ArtifactId, Version, Classifier). Its content is then added to the project classpath (according to the dependency scope), but since its packages were renamed, you may get conflicts that are difficult to handle. Again, you shouldn't have renamed packages among your project dependencies (but often you can't know that).

Does any one have any resources that they use?

In general, you should understand well how Maven handles dependencies and use the resources it offers (its tools and mechanisms). Some important points follow:

  • dependencyManagement is definitely the entry point in this topic: here you can deal with Maven Dependency Mediation and influence its decisions on transitive dependencies, their versions, and their scope. One important point: what you add to dependencyManagement is not automatically added as a dependency. A dependencyManagement entry is only taken into account once a dependency of the project (declared in the pom.xml file or pulled in transitively) matches it; otherwise it is simply ignored. It's an important part of the pom.xml because it helps govern dependencies and their transitive graphs, and that's why it is often used in parent poms: if you want to decide once, in a centralized manner, which version of, e.g., log4j is used in all of your Maven projects, you declare it in the dependencyManagement of a common/shared parent pom and you make sure it will be used as such. Centralization means better governance and better maintenance.
  • The dependencies section is important for declaring dependencies: normally, you should declare here only the direct dependencies you need. A good rule of thumb is: declare here in compile (the default) scope only what you actually use in import statements in your code (but you often need to go beyond that, e.g., a JDBC driver required at runtime and never referenced in your code, which would then be in runtime scope). Also remember that position matters: mediation picks the declaration nearest to your project in the dependency tree, and at equal depth the first declared one wins. A dependency declared directly in your own pom.xml therefore wins over a conflicting transitive one, so by explicitly re-declaring a dependency you can effectively influence dependency mediation.
  • Don't overuse exclusions to handle transitive dependencies: use dependencyManagement and the order of dependencies for that, if you can. Overusing exclusions makes maintenance much more difficult, so use them only if you really need to. Also, when adding exclusions, always add an XML comment explaining why: your teammates and/or your future self will appreciate it.
  • Use dependency scopes thoughtfully. Use the default (compile) scope for what you really need for compilation and testing (e.g. log4j), use test only (and only) for what is used under test (e.g. junit), mind the provided scope for what is already provided by your target container (e.g. servlet-api), and use the runtime scope only for what you need at runtime but should never compile against (e.g. JDBC drivers). Don't use the system scope, since it only brings trouble (e.g. it is not packaged with your final artifact).
  • Don't play with version ranges unless you have specific reasons, and be aware that a plain version is only a soft requirement (a recommendation that dependency mediation may override). The [<version>] expression is the strongest one, a hard requirement for exactly that version, but you would rarely need it.
  • Use a Maven property as a placeholder for the version element of families of libraries, so that you have one centralised place for the versioning of a set of dependencies that all share the same version value. A classic example would be a spring.version or hibernate.version property used for several dependencies. Again, centralisation means better governance and maintenance, which also means less headache and less hell.
  • When one is provided, import a BOM as an alternative to the point above and to better handle families of dependencies (e.g. jboss), delegating the management of a certain set of dependencies to another pom.xml file.
  • Don't (ab)use SNAPSHOT dependencies (or use them as little as possible). If you really need to, make sure you never release using a SNAPSHOT dependency: build reproducibility will be in serious danger otherwise.
  • When troubleshooting, always check the full hierarchy of your pom.xml file. help:effective-pom can be really useful for checking the effective dependencyManagement, dependencies and properties that determine the final dependency graph.
  • Use some other Maven plugins to help you out with governance. The maven-dependency-plugin is really helpful during troubleshooting, but the maven-enforcer-plugin also comes to help. Here are a few examples worth mentioning:

The following example will make sure that no one (you, your teammates, your future self) will be able to add a well-known test library in compile scope: the build will fail. It makes sure junit will never reach PROD (packaged with your war, for example).

<plugin>
    <artifactId>maven-enforcer-plugin</artifactId>
    <version>1.4.1</version>
    <executions>
        <execution>
            <id>enforce-test-scope</id>
            <phase>validate</phase>
            <goals>
                <goal>enforce</goal>
            </goals>
            <configuration>
                <rules>
                    <bannedDependencies>
                        <excludes>
                            <exclude>junit:junit:*:*:compile</exclude>
                            <exclude>org.mockito:mockito-*:*:*:compile</exclude>
                            <exclude>org.easymock:easymock*:*:*:compile</exclude>
                            <exclude>org.powermock:powermock-*:*:*:compile</exclude>
                            <exclude>org.seleniumhq.selenium:selenium-*:*:*:compile</exclude>
                            <exclude>org.springframework:spring-test:*:*:compile</exclude>
                            <exclude>org.hamcrest:hamcrest-all:*:*:compile</exclude>
                        </excludes>
                        <message>Test dependencies should be in test scope!</message>
                    </bannedDependencies>
                </rules>
                <fail>true</fail>
            </configuration>
        </execution>
    </executions>
</plugin>

Have a look at other standard rules this plugin offers: many could be useful to break the build in case of wrong scenarios:

  • you can ban a dependency (even transitively), really handy in many cases
  • you can fail in case of SNAPSHOT used, handy in a release profile, as an example.
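
As a minimal sketch of the two rules just listed, the rules element of the enforcer configuration could look like the fragment below. It assumes the plugin's standard requireReleaseDeps and bannedDependencies rules; the banned artifact (commons-logging) is only a hypothetical example, not something taken from the question.

```xml
<!-- Fragment for the <configuration> of a maven-enforcer-plugin execution.
     The banned artifact below is a hypothetical example. -->
<rules>
    <!-- Fails the build if any dependency is a SNAPSHOT; handy in a release profile -->
    <requireReleaseDeps>
        <message>No SNAPSHOT dependencies are allowed in a release build!</message>
    </requireReleaseDeps>
    <!-- Bans a dependency, also when it only arrives transitively -->
    <bannedDependencies>
        <excludes>
            <exclude>commons-logging:commons-logging</exclude>
        </excludes>
        <searchTransitive>true</searchTransitive>
        <message>This library is banned; see the parent pom for the replacement.</message>
    </bannedDependencies>
</rules>
```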

Again, a common parent pom could include more than one of these mechanisms (dependencyManagement, enforcer plugin, properties for dependency families) and make sure certain rules are respected. You may not cover all the possible scenarios, but it would definitely decrease the degree of hell you perceive and experience.
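
Applied to the Jackson case from the question, a parent pom combining a version property with dependencyManagement might look like the sketch below. The version value is an illustrative assumption; use whichever version your own tests confirm works with all of your SDKs.

```xml
<!-- Parent pom.xml fragment. The version number is an illustrative assumption. -->
<properties>
    <!-- One place to change the version of the whole Jackson family -->
    <jackson.version>2.6.7</jackson.version>
</properties>

<dependencyManagement>
    <dependencies>
        <!-- These entries add nothing to the classpath by themselves: they only
             fix the version whenever the artifact is declared or arrives transitively -->
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-core</artifactId>
            <version>${jackson.version}</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
            <version>${jackson.version}</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

After a change like this, running mvn dependency:tree -Dincludes=com.fasterxml.jackson.core should show a single version for each managed artifact.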




Answer 2:


Use the Maven Helper plugin to easily resolve conflicts by excluding the old versions of dependencies.
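
An exclusion of this kind ends up in the pom.xml looking roughly like the sketch below, whether a tool adds it or you write it by hand. The aws.sdk.version property is hypothetical, and the sketch assumes the conflicting Jackson arrives through the Kinesis SDK, as described in the question.

```xml
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-kinesis</artifactId>
    <!-- aws.sdk.version is a hypothetical property defined elsewhere in the pom -->
    <version>${aws.sdk.version}</version>
    <exclusions>
        <!-- Excluded because the transitive Jackson conflicts with the version
             used by the rest of the project, which is declared explicitly -->
        <exclusion>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

Once an artifact is excluded, the project has to provide a compatible version of it through another dependency, otherwise the excluded classes will be missing at runtime.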




Answer 3:


In my experience I haven't found anything fully automated, but I have found the following approach quite systematic and useful for myself:

First of all, I try to have a clear map of the project structure and of the relations between projects. I usually use the Eclipse graphical dependency view, which tells me, for example, whether a dependency is omitted because of a conflict with another one. It also shows the resolved dependencies for the project. I honestly don't use IntelliJ IDEA, but I believe it has a similar feature.

Usually I try to put very common dependencies higher in the structure, and I exploit the <dependencyManagement> feature to take care of the versions of transitive dependencies and, most importantly, to avoid duplicates in the project structure.

In this Maven - Manage Dependencies blog post you can find a good tutorial about dependency management.

When adding a new dependency to my project, as in your case, I take care of where it is added in my project structure and make changes accordingly, but in most cases the dependency management mechanism is capable of dealing with this problem.

In this Maven Best Practices blog post you can find:

Maven's dependencyManagement section allows a parent pom.xml to define dependencies that are potentially reused in child projects. This avoids duplication; without the dependencyManagement section, each child project has to define its own dependency and duplicate the version, scope, and type of the dependency.

Obviously if you need a particular version of a dependency for a project you can always specify the version you need locally, deep in the hierarchy.
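
A minimal sketch of such a local override in a child module follows, assuming the parent manages jackson-databind; the version number is only illustrative.

```xml
<!-- Child module pom.xml fragment. The version number is an illustrative assumption. -->
<dependencies>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <!-- An explicit version here takes precedence over the one inherited
             from the parent's dependencyManagement -->
        <version>2.6.7</version>
    </dependency>
</dependencies>
```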

I agree with you, it can be quite tedious, but dependency management can be a good help.




Answer 4:


Even after replacing all the jars that share a name, you can still have classes with the same fully qualified name. I used the maven-shade-plugin in one of my projects: it prints the classes with the same fully qualified name that come from different jars. Maybe that can help you.



Source: https://stackoverflow.com/questions/33907162/systematic-approach-with-maven-to-deal-with-dependency-hell
