micro-optimization

How could this Java code be sped up?

半城伤御伤魂 submitted on 2020-01-02 07:58:32
Question: I am trying to benchmark how fast Java can do a simple task: read a huge file into memory and then perform some meaningless calculations on the data. All types of optimizations count, whether it's rewriting the code differently, using a different JVM, or tricking the JIT. The input file is a 500-million-long list of 32-bit integer pairs separated by a comma, like this: 44439,5023 33140,22257 ... This file takes 5.5 GB on my machine. The program can't use more than 8 GB of RAM and can use only a single …
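The usual first win for input like this is to skip `String.split` and `Integer.parseInt` entirely and parse the digits straight from bytes. A minimal sketch of that idea (the class and method names below are invented for illustration, not the asker's code):

```java
// Sketch: parse comma/newline-separated non-negative integers straight
// from a byte buffer, avoiding String allocation and Integer.parseInt.
public class PairParser {
    // Parses digits starting at pos[0]; advances pos past the number
    // and the following separator (',' or '\n').
    static int parseNumber(byte[] buf, int[] pos) {
        int v = 0;
        int i = pos[0];
        while (i < buf.length && buf[i] >= '0' && buf[i] <= '9') {
            v = v * 10 + (buf[i] - '0');
            i++;
        }
        pos[0] = i + 1; // skip the separator
        return v;
    }

    // Example "meaningless calculation": sum every number in the buffer.
    public static long sumAll(byte[] buf) {
        long sum = 0;
        int[] pos = {0};
        while (pos[0] < buf.length) {
            sum += parseNumber(buf, pos);
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = "44439,5023\n33140,22257\n".getBytes();
        System.out.println(sumAll(data)); // sums all four numbers
    }
}
```

For the real 5.5 GB file, the same parser would be fed from a memory-mapped `ByteBuffer` or a large reused `byte[]` chunk rather than a materialized array.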

Are there any penalties when using 64/32-bit registers in long mode?

断了今生、忘了曾经 submitted on 2020-01-02 02:01:11
Question: This is probably not even micro- but nano-optimization, but the subject interests me, and I would like to know whether there are any penalties for using non-native register sizes in long mode. I've learned from various sources that partial register updates (like ax instead of eax) can cause a stall and degrade performance. But I'm not sure about long mode. What register size is considered native for this processor operating mode? x86-64 is still an extension to the x86 architecture …
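The key architectural fact is that 32-bit operand-size writes zero-extend into the full 64-bit register, so a 32-bit destination carries no dependency on the old value; only 8/16-bit writes merge into the previous contents and can create partial-register dependencies. A short illustrative fragment (NASM-style syntax, for discussion only):

```
; 32-bit write: zero-extends into rax, no dependency on the old rax
mov eax, 1        ; rax = 0x00000000_00000001

; 16-bit write: merges into the low word, so it depends on the old rax
mov ax, 1         ; rax = (old rax & ~0xFFFF) | 1
```

How expensive the merging case is varies by microarchitecture, so the asker's "any penalties?" has no single answer without naming a CPU family.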

How to minimize the mutex locking for an object when only 1 thread mostly uses that object and the other thread(s) use it rarely?

≯℡__Kan透↙ submitted on 2020-01-01 12:32:32
Question: Scenario: suppose there are "Thread_Main" and "Thread_DB", with a shared SQLite database object. It's guaranteed that "Thread_Main" seldom uses the SQLite object, and only for reading (i.e. SELECT), while "Thread_DB" uses the SQLite object most of the time for various INSERT, UPDATE, and DELETE operations. To avoid data races and UB, SQLite should be compiled with the SQLITE_THREADSAFE=1 (default) option. That means before every operation an internal mutex will be locked, so that the DB is not writing when reading and …
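One common way to sidestep per-operation locking is to give the database to Thread_DB exclusively and have Thread_Main post its rare reads as tasks to a queue, so only the queue (not the DB) needs a mutex. A minimal sketch of that pattern (the `DbWorker` name and structure are invented here, not from the question):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Sketch: all DB access happens on one worker thread; other threads
// enqueue closures instead of touching the DB (and its mutex) directly.
class DbWorker {
public:
    DbWorker() : worker_([this] { run(); }) {}
    ~DbWorker() {
        post(nullptr);          // empty task = stop sentinel
        worker_.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return !tasks_.empty(); });
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            if (!task) return;  // stop sentinel
            task();             // e.g. run an INSERT/SELECT on the DB object
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::thread worker_;        // declared last: members above exist first
};
```

With this shape, SQLite could even be used in single-thread mode, because only the worker thread ever touches the connection; Thread_Main pays a mutex cost only on its rare posts.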

What is faster in Python, “while” or “for xrange”

邮差的信 submitted on 2020-01-01 11:36:08
Question: We can do numeric iteration like: for i in xrange(10): print i, or in C style: i = 0; while i < 10: print i; i = i + 1. Yes, I know the first one is less error-prone and more Pythonic, but is it as fast as the C-style version? P.S. I'm from the C++ planet and pretty new to the Python one. Answer 1: I am sure the while version is slower. Python has to look up the add operation for the integer object on each turn of the loop, etc.; it is not pure C just because it looks like it! And if you want a Pythonic …
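The claim is easy to check with the timeit module. A small Python 3 sketch (range plays the role of Python 2's xrange; the loop bodies sum instead of print so there is something to measure):

```python
import timeit

def for_loop(n=1000):
    total = 0
    for i in range(n):       # index bookkeeping handled by the iterator, in C
        total += i
    return total

def while_loop(n=1000):
    total = 0
    i = 0
    while i < n:             # explicit compare and increment, in bytecode
        total += i
        i += 1
    return total

# Both compute the same sum; the difference is only interpreter overhead.
assert for_loop() == while_loop()

t_for = timeit.timeit(for_loop, number=2000)
t_while = timeit.timeit(while_loop, number=2000)
print(f"for:   {t_for:.3f}s")
print(f"while: {t_while:.3f}s")
```

The while version executes extra bytecode per iteration (load i, compare, add, store), which is exactly the lookup-per-turn cost the answer describes.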

Use of lazy val for caching string representation

ぃ、小莉子 submitted on 2020-01-01 09:42:00
Question: I encountered the following code in JAXMag's Scala special issue: package com.weiglewilczek.gameoflife; case class Cell(x: Int, y: Int) { override def toString = position; private lazy val position = "(%s, %s)".format(x, y) }. Does the use of lazy val in the above code provide considerably better performance than the following? package com.weiglewilczek.gameoflife; case class Cell(x: Int, y: Int) { override def toString = "(%s, %s)".format(x, y) }. Or is it just a case of unnecessary …
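Whether it pays off depends on how often toString is called on the same instance: a lazy val formats the string once per Cell and caches it, while the plain override reformats on every call. A small sketch that makes the caching visible (the formatCount counter is added here purely for illustration):

```scala
object LazyValDemo {
  var formatCount = 0

  case class Cell(x: Int, y: Int) {
    override def toString = position
    private lazy val position = {
      formatCount += 1          // counts how many times we actually format
      "(%s, %s)".format(x, y)
    }
  }

  def main(args: Array[String]): Unit = {
    val c = Cell(1, 2)
    c.toString; c.toString; c.toString
    println(formatCount)        // the format body ran only once
  }
}
```

The trade-off is that each lazy val adds a field plus a volatile-style initialization check to every instance, so it only wins when toString is called repeatedly on long-lived cells.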

std::vector-like class optimized to hold a small number of items [duplicate]

别说谁变了你拦得住时间么 submitted on 2020-01-01 07:35:34
Question: This question already has answers here: small string optimization for vector? (4 answers). Closed 4 years ago. In one time-critical part of the program there is a member of the class that looks like this: std::vector m_vLinks;. During profiling I noticed that in about 99.98% of executions this vector holds only 0 or 1 items. However, in very rare cases it might hold more. This vector is definitely a bottleneck according to the profiler, so I'm thinking about the following optimization: craft a hand- …
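Ready-made versions of this exist (boost::container::small_vector, llvm::SmallVector), but the core idea is small enough to sketch: keep up to N items inline and fall back to a heap vector only when that capacity is exceeded. The class below is a deliberately minimal illustration of the shape, not a production container:

```cpp
#include <cstddef>
#include <vector>

// Sketch of the small-buffer idea: up to N items live inline in the
// object (no allocation); the heap vector is touched only in the
// rare case where more than N items are stored.
template <typename T, std::size_t N = 1>
class SmallVec {
public:
    void push_back(const T& v) {
        if (size_ < N) {
            inline_[size_] = v;
        } else {
            if (size_ == N)                   // first spill: move inline part
                heap_.assign(inline_, inline_ + N);
            heap_.push_back(v);
        }
        ++size_;
    }
    T& operator[](std::size_t i) {
        return size_ <= N ? inline_[i] : heap_[i];
    }
    std::size_t size() const { return size_; }
private:
    T inline_[N]{};
    std::vector<T> heap_;   // empty (and unallocated) in the common case
    std::size_t size_ = 0;
};
```

With N = 1 this matches the asker's profile: the 99.98% case never allocates, and only the rare multi-item case pays for the copy into the heap vector.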

Why are DateTime.Now and DateTime.UtcNow so slow/expensive?

自作多情 submitted on 2020-01-01 03:58:05
Question: I realize this is way too far into the micro-optimization area, but I am curious to understand why calls to DateTime.Now and DateTime.UtcNow are so "expensive". I have a sample program that runs a couple of scenarios of doing some "work" (adding to a counter) and attempts to do this for 1 second. I have several approaches for making it do the work for a limited quantity of time. The examples show that DateTime.Now and DateTime.UtcNow are significantly slower than Environment.TickCount, but …
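The pattern the question describes looks roughly like the C# sketch below (a reconstruction for discussion, not the asker's program). The difference is in what each time check costs: Environment.TickCount is a cheap read of an OS tick counter, while each DateTime.UtcNow call has to fetch the system time and construct a DateTime value, and DateTime.Now additionally applies the local time-zone conversion.

```csharp
// Sketch: do "work" (increment a counter) for ~1 second, using two
// different time sources for the deadline check.
int counter1 = 0;
int start = Environment.TickCount;            // cheap tick read
while (Environment.TickCount - start < 1000)
    counter1++;

long counter2 = 0;
DateTime end = DateTime.UtcNow.AddSeconds(1); // per-iteration: fetch time + build DateTime
while (DateTime.UtcNow < end)
    counter2++;

// counter1 typically ends up much larger than counter2, because every
// iteration of the second loop pays the heavier time-check cost.
```

Since the loop body is nearly free, the benchmark is effectively measuring the time source itself, which is why the gap looks so dramatic.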

How expensive is it to convert between int and double?

女生的网名这么多〃 submitted on 2019-12-31 18:58:56
Question: I often see code that converts ints to doubles and back again (sometimes for good reasons, sometimes not), and it just occurred to me that this seems like a "hidden" cost in my program. Let's assume the conversion method is truncation. So just how expensive is it? I'm sure it varies depending on hardware, so let's assume a newish Intel processor (Haswell, if you like, though I'll take anything). Some metrics I'd be interested in (though a good answer needn't have all …
