Question
This is the Apache Ignite (version 2.7.5) configuration that I am using for my 2-node PARTITIONED cluster.
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans.xsd">
    <!-- No-op failure handler: failures are ignored instead of stopping the node. -->
    <bean name="noOpFailureHandler" class="org.apache.ignite.failure.NoOpFailureHandler"/>
    <bean id="ignite.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="peerClassLoadingEnabled" value="true"/>
        <property name="igniteInstanceName" value="GridA"/>
        <property name="clientMode" value="false"/>
        <property name="failureDetectionTimeout" value="80000"/>
        <property name="clientFailureDetectionTimeout" value="120000"/>
        <property name="systemWorkerBlockedTimeout" value="30000" />
        <property name="longQueryWarningTimeout" value="3000"/>
        <property name="failureHandler" ref="noOpFailureHandler"/>
        <property name="metricsLogFrequency" value="#{600 * 10 * 1000}"/>
        <property name="rebalanceThreadPoolSize" value="16"/>
        <property name="dataStorageConfiguration">
            <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
                <!-- Redefining the default region's settings -->
                <property name="pageSize" value="#{4 * 1024}"/>
                <!--<property name="writeThrottlingEnabled" value="true"/>-->
                <property name="defaultDataRegionConfiguration">
                    <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                        <property name="persistenceEnabled" value="true"/>
                        <property name="initialSize" value="#{105L * 1024 * 1024 * 1024}"/>
                        <property name="name" value="Default_Region"/>
                        <!-- Setting the maximum size of the default region to 120 GB. -->
                        <property name="maxSize" value="#{120L * 1024 * 1024 * 1024}"/>
                        <property name="checkpointPageBufferSize"
                                  value="#{4096L * 1024 * 1024}"/>
                        <!--<property name="pageEvictionMode" value="RANDOM_2_LRU"/>-->
                    </bean>
                </property>
                <property name="walPath" value="/wal/grid"/>
                <property name="walArchivePath" value="/wal/grid/archive"/>
                <property name="storagePath" value="/ignite/persistence"/>
                <property name="checkpointFrequency" value="180000"/>
                <property name="checkpointThreads" value="8"/>
                <property name="walMode" value="BACKGROUND"/>
                <property name="walSegmentSize" value="#{1L * 1024 * 1024 * 1024}"/>
                <!--<property name="authenticationEnabled" value="true"/>-->
            </bean>
        </property>
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                        <property name="multicastGroup" value="224.0.0.180"/>
                        <property name="multicastPort" value="47514"/>
                    </bean>
                </property>
            </bean>
        </property>
        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="messageQueueLimit" value="2048"/>
                <property name="socketWriteTimeout" value="10000"/>
                <property name="connectionsPerNode" value="10"/>
                <property name="usePairedConnections" value="true"/>
                <property name="socketReceiveBuffer" value="#{64L * 1024}"/>
            </bean>
        </property>
    </bean>
</beans>
Ignite is started with the following JVM parameters:
/usr/java/jdk1.8.0_144/bin/java -XX:+AggressiveOpts -server -Xms20g -Xmx20g -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:+ScavengeBeforeFullGC -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/etappdata/ignite/logs/PROD/etail-prod-ignite76-164/logs -XX:+ExitOnOutOfMemoryError -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M -Xloggc:/etappdata/ignite/logs/PROD/etail-prod-ignite76-164/gc.log -XX:+PrintAdaptiveSizePolicy -XX:+UseTLAB -verbose:gc -XX:+ParallelRefProcEnabled -XX:+UseLargePages -XX:+AggressiveOpts -Djava.net.preferIPv4Stack=true -Djava.net.preferIPv4Addresses=true -Djava.net.preferIPv6Stack=false -Djava.net.preferIPv6Addresses=false -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=8996 -Dcom.sun.management.jmxremote.rmi.port=8996 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.local.only=false -Djava.rmi.server.hostname=etail-prod-ignite76-164 -XX:MaxDirectMemorySize=4g -javaagent:/tmp/apminsight-javaagent-prod/apminsight-javaagent.jar -Dfile.encoding=UTF-8 -XX:+UseG1GC -DIGNITE_QUIET=false -DIGNITE_SUCCESS_FILE=/ignite/apache-ignite-2.7.5-bin/work/ignite_success_0cbecd49-5b7f-4a41-b2f2-42bb66b2ea5c -Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=49128 -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -DIGNITE_HOME=/ignite/apache-ignite-2.7.5-bin -DIGNITE_PROG_NAME=./bin/ignite.sh -cp /ignite/apache-ignite-2.7.5-bin/libs/:/ignite/apache-ignite-2.7.5-bin/libs/ignite-indexing/:/ignite/apache-ignite-2.7.5-bin/libs/ignite-spring/:/ignite/apache-ignite-2.7.5-bin/libs/licenses/ org.apache.ignite.startup.cmdline.CommandLineStartup config/my-cache.xml
[Note: Each node has 210 GB RAM]
I get metrics like the following every 100 minutes, as set by metricsLogFrequency in the config:
[00:33:36,452][INFO][grid-timeout-worker-#67%GridA%][IgniteKernal%GridA]
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=92dda713, name=GridA, uptime=01:40:00.019]
^-- H/N/C [hosts=10, nodes=10, CPUs=172]
^-- CPU [cur=2.13%, avg=2.16%, GC=0%]
^-- PageMemory [pages=5535967]
^-- Heap [used=6605MB, free=67.75%, comm=20480MB]
^-- Off-heap [used=21878MB, free=82.24%, comm=123179MB]
^-- sysMemPlc region [used=0MB, free=99.99%, comm=99MB]
^-- metastoreMemPlc region [used=0MB, free=99.77%, comm=99MB]
^-- Default_Region region [used=21878MB, free=82.2%, comm=122880MB]
^-- TxLog region [used=0MB, free=100%, comm=99MB]
^-- Ignite persistence [used=281575MB]
^-- sysMemPlc region [used=0MB]
^-- metastoreMemPlc region [used=unknown]
^-- Default_Region region [used=281575MB]
^-- TxLog region [used=0MB]
^-- Outbound messages queue [size=0]
^-- Public thread pool [active=0, idle=0, qSize=0]
^-- System thread pool [active=0, idle=6, qSize=0]
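As a sanity check on the numbers above, here is a minimal sketch in plain Java (no Ignite dependency) that redoes the arithmetic from the config. ConfigUnits and its methods are hypothetical helper names used only for illustration; they are not part of the Ignite API. It shows that #{600 * 10 * 1000} is 6,000,000 ms (100 minutes, which matches uptime=01:40:00 in the report), that a 1-minute interval would be 60000 ms, and that the configured maxSize corresponds to the comm=122880MB reported for Default_Region.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not part of the Ignite API.
final class ConfigUnits {
    private static final long BYTES_PER_MIB = 1024L * 1024;

    // Converts a millisecond interval (e.g. metricsLogFrequency) to whole minutes.
    static long millisToMinutes(long millis) {
        return Duration.ofMillis(millis).toMinutes();
    }

    // Converts a desired interval in minutes to the millisecond value the config expects.
    static long minutesToMillis(long minutes) {
        return Duration.ofMinutes(minutes).toMillis();
    }

    // Converts a byte count (e.g. maxSize) to whole MiB, the unit used in the metrics log.
    static long bytesToMiB(long bytes) {
        return bytes / BYTES_PER_MIB;
    }

    public static void main(String[] args) {
        long metricsLogFrequency = 600L * 10 * 1000;      // from the config
        long initialSize = 105L * 1024 * 1024 * 1024;     // from the config
        long maxSize = 120L * 1024 * 1024 * 1024;         // from the config

        System.out.println("metricsLogFrequency = " + millisToMinutes(metricsLogFrequency) + " min");
        System.out.println("1 min in ms         = " + minutesToMillis(1));
        System.out.println("initialSize         = " + bytesToMiB(initialSize) + " MB");
        System.out.println("maxSize             = " + bytesToMiB(maxSize) + " MB");
    }
}
```

So, assuming the property keeps its millisecond unit, a 1-minute interval would be written as value="60000" in place of the current expression.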
Q: What should I do to get more specific monitoring metrics? Are there any implications if I change metricsLogFrequency to 1 minute?
Should I add the following to the configuration file?
<!-- Enable metrics for this data region -->
<property name="metricsEnabled" value="true"/>
How can I see more monitoring metrics, such as pagesUsed, pagesReplaced, pagesFillFactor, etc.?
Or should I add code to the client application, like this:
Ignite ignite = Ignition.ignite("GridA");
List<DataRegionMetrics> dataRegionMetricsList = new ArrayList<>(ignite.dataRegionMetrics());
dataRegionMetricsList.forEach(
    dataRegionMetrics -> LOG.info(dataRegionMetrics.getName() + ": " + dataRegionMetrics.getAllocationRate() + ":"
        + dataRegionMetrics.getPagesFillFactor() + ":" + dataRegionMetrics.getPagesReplaceRate())
);
Please help!
Source: https://stackoverflow.com/questions/65125071/apache-ignite-2-7-5-monitoring-metrics