micro-optimization

How could this Java code be sped up?

半城伤御伤魂 submitted on 2020-01-02 07:58:32
Question: I am trying to benchmark how fast Java can do a simple task: read a huge file into memory and then perform some meaningless calculations on the data. All types of optimizations count, whether it's rewriting the code differently, using a different JVM, or tricking the JIT. The input file is a 500-million-long list of 32-bit integer pairs separated by a comma, like this: 44439,5023 33140,22257 ... This file takes 5.5 GB on my machine. The program can't use more than 8 GB of RAM and can use only a single …
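The usual first win for input like this is to skip `String.split` and `Integer.parseInt` entirely and parse the digits straight from bytes. A minimal sketch of that idea (the class and method names below are invented for illustration, not the asker's code):

```java
// Sketch: parse comma/newline-separated non-negative integers straight
// from a byte buffer, avoiding String allocation and Integer.parseInt.
public class PairParser {
    // Parses digits starting at pos[0]; advances pos past the number
    // and the following separator (',' or '\n').
    static int parseNumber(byte[] buf, int[] pos) {
        int v = 0;
        int i = pos[0];
        while (i < buf.length && buf[i] >= '0' && buf[i] <= '9') {
            v = v * 10 + (buf[i] - '0');
            i++;
        }
        pos[0] = i + 1; // skip the separator
        return v;
    }

    // Example "meaningless calculation": sum every number in the buffer.
    public static long sumAll(byte[] buf) {
        long sum = 0;
        int[] pos = {0};
        while (pos[0] < buf.length) {
            sum += parseNumber(buf, pos);
        }
        return sum;
    }

    public static void main(String[] args) {
        byte[] data = "44439,5023\n33140,22257\n".getBytes();
        System.out.println(sumAll(data)); // sums all four numbers
    }
}
```

For the real 5.5 GB file, the same parser would be fed from a memory-mapped `ByteBuffer` or a large reused `byte[]` chunk rather than a materialized array.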

Are there any penalties when using 64/32-bit registers in long mode?

断了今生、忘了曾经 submitted on 2020-01-02 02:01:11
Question: This is probably not even micro- but nano-optimization, but the subject interests me, and I would like to know whether there are any penalties for using non-native register sizes in long mode. I've learned from various sources that partial register updates (like ax instead of eax) can cause a stall and degrade performance. But I'm not sure about long mode. What register size is considered native for this processor operating mode? x86-64 is still an extension to the x86 architecture …
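The key architectural fact is that 32-bit operand-size writes zero-extend into the full 64-bit register, so a 32-bit destination carries no dependency on the old value; only 8/16-bit writes merge into the previous contents and can create partial-register dependencies. A short illustrative fragment (NASM-style syntax, for discussion only):

```
; 32-bit write: zero-extends into rax, no dependency on the old rax
mov eax, 1        ; rax = 0x00000000_00000001

; 16-bit write: merges into the low word, so it depends on the old rax
mov ax, 1         ; rax = (old rax & ~0xFFFF) | 1
```

How expensive the merging case is varies by microarchitecture, so the asker's "any penalties?" has no single answer without naming a CPU family.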

How to minimize the mutex locking for an object when only 1 thread mostly uses that object and the other thread(s) use it rarely?

≯℡__Kan透↙ submitted on 2020-01-01 12:32:32
Question: Scenario: suppose there are "Thread_Main" and "Thread_DB", with a shared SQLite database object. It's guaranteed that "Thread_Main" seldom uses the SQLite object, and only for reading (i.e. SELECT), while "Thread_DB" uses the SQLite object most of the time for various INSERT, UPDATE, and DELETE operations. To avoid data races and UB, SQLite should be compiled with the SQLITE_THREADSAFE=1 (default) option. That means before every operation an internal mutex will be locked, so that the DB is not writing when reading and …
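One common way to sidestep per-operation locking is to give the database to Thread_DB exclusively and have Thread_Main post its rare reads as tasks to a queue, so only the queue (not the DB) needs a mutex. A minimal sketch of that pattern (the `DbWorker` name and structure are invented here, not from the question):

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

// Sketch: all DB access happens on one worker thread; other threads
// enqueue closures instead of touching the DB (and its mutex) directly.
class DbWorker {
public:
    DbWorker() : worker_([this] { run(); }) {}
    ~DbWorker() {
        post(nullptr);          // empty task = stop sentinel
        worker_.join();
    }
    void post(std::function<void()> task) {
        {
            std::lock_guard<std::mutex> lk(m_);
            tasks_.push(std::move(task));
        }
        cv_.notify_one();
    }
private:
    void run() {
        for (;;) {
            std::function<void()> task;
            {
                std::unique_lock<std::mutex> lk(m_);
                cv_.wait(lk, [this] { return !tasks_.empty(); });
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            if (!task) return;  // stop sentinel
            task();             // e.g. run an INSERT/SELECT on the DB object
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::function<void()>> tasks_;
    std::thread worker_;        // declared last: members above exist first
};
```

With this shape, SQLite could even be used in single-thread mode, because only the worker thread ever touches the connection; Thread_Main pays a mutex cost only on its rare posts.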

What is faster in Python, “while” or “for xrange”

邮差的信 submitted on 2020-01-01 11:36:08
Question: We can do numeric iteration like: for i in xrange(10): print i, or in C style: i = 0; while i < 10: print i; i = i + 1. Yes, I know the first one is less error-prone and more Pythonic, but is it as fast as the C-style version? P.S. I'm from the C++ planet and pretty new to the Python one. Answer 1: I am sure the while version is slower. Python has to look up the add operation for the integer object on each turn of the loop, etc.; it is not pure C just because it looks like it! And if you want a Pythonic …
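The claim is easy to check with the timeit module. A small Python 3 sketch (range plays the role of Python 2's xrange; the loop bodies sum instead of print so there is something to measure):

```python
import timeit

def for_loop(n=1000):
    total = 0
    for i in range(n):       # index bookkeeping handled by the iterator, in C
        total += i
    return total

def while_loop(n=1000):
    total = 0
    i = 0
    while i < n:             # explicit compare and increment, in bytecode
        total += i
        i += 1
    return total

# Both compute the same sum; the difference is only interpreter overhead.
assert for_loop() == while_loop()

t_for = timeit.timeit(for_loop, number=2000)
t_while = timeit.timeit(while_loop, number=2000)
print(f"for:   {t_for:.3f}s")
print(f"while: {t_while:.3f}s")
```

The while version executes extra bytecode per iteration (load i, compare, add, store), which is exactly the lookup-per-turn cost the answer describes.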

Use of lazy val for caching string representation

ぃ、小莉子 submitted on 2020-01-01 09:42:00
Question: I encountered the following code in JAXMag's Scala special issue: package com.weiglewilczek.gameoflife; case class Cell(x: Int, y: Int) { override def toString = position; private lazy val position = "(%s, %s)".format(x, y) }. Does the use of lazy val in the above code provide considerably better performance than the following? package com.weiglewilczek.gameoflife; case class Cell(x: Int, y: Int) { override def toString = "(%s, %s)".format(x, y) }. Or is it just a case of unnecessary …
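Whether it pays off depends on how often toString is called on the same instance: a lazy val formats the string once per Cell and caches it, while the plain override reformats on every call. A small sketch that makes the caching visible (the formatCount counter is added here purely for illustration):

```scala
object LazyValDemo {
  var formatCount = 0

  case class Cell(x: Int, y: Int) {
    override def toString = position
    private lazy val position = {
      formatCount += 1          // counts how many times we actually format
      "(%s, %s)".format(x, y)
    }
  }

  def main(args: Array[String]): Unit = {
    val c = Cell(1, 2)
    c.toString; c.toString; c.toString
    println(formatCount)        // the format body ran only once
  }
}
```

The trade-off is that each lazy val adds a field plus a volatile-style initialization check to every instance, so it only wins when toString is called repeatedly on long-lived cells.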

std::vector-like class optimized to hold a small number of items [duplicate]

别说谁变了你拦得住时间么 submitted on 2020-01-01 07:35:34
Question: This question already has answers here: small string optimization for vector? (4 answers). Closed 4 years ago. In one time-critical part of the program there is a member of the class that looks like this: std::vector m_vLinks;. During profiling I noticed that in about 99.98% of executions this vector holds only 0 or 1 items. However, in very rare cases it might hold more. This vector is definitely a bottleneck according to the profiler, so I'm thinking about the following optimization: craft a hand- …
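Ready-made versions of this exist (boost::container::small_vector, llvm::SmallVector), but the core idea is small enough to sketch: keep up to N items inline and fall back to a heap vector only when that capacity is exceeded. The class below is a deliberately minimal illustration of the shape, not a production container:

```cpp
#include <cstddef>
#include <vector>

// Sketch of the small-buffer idea: up to N items live inline in the
// object (no allocation); the heap vector is touched only in the
// rare case where more than N items are stored.
template <typename T, std::size_t N = 1>
class SmallVec {
public:
    void push_back(const T& v) {
        if (size_ < N) {
            inline_[size_] = v;
        } else {
            if (size_ == N)                   // first spill: move inline part
                heap_.assign(inline_, inline_ + N);
            heap_.push_back(v);
        }
        ++size_;
    }
    T& operator[](std::size_t i) {
        return size_ <= N ? inline_[i] : heap_[i];
    }
    std::size_t size() const { return size_; }
private:
    T inline_[N]{};
    std::vector<T> heap_;   // empty (and unallocated) in the common case
    std::size_t size_ = 0;
};
```

With N = 1 this matches the asker's profile: the 99.98% case never allocates, and only the rare multi-item case pays for the copy into the heap vector.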

Why are DateTime.Now and DateTime.UtcNow so slow/expensive?

自作多情 submitted on 2020-01-01 03:58:05
Question: I realize this is way too far into the micro-optimization area, but I am curious to understand why calls to DateTime.Now and DateTime.UtcNow are so "expensive". I have a sample program that runs a couple of scenarios of doing some "work" (adding to a counter) and attempts to do this for 1 second. I have several approaches for making it do the work for a limited quantity of time. The examples show that DateTime.Now and DateTime.UtcNow are significantly slower than Environment.TickCount, but …
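The pattern the question describes looks roughly like the C# sketch below (a reconstruction for discussion, not the asker's program). The difference is in what each time check costs: Environment.TickCount is a cheap read of an OS tick counter, while each DateTime.UtcNow call has to fetch the system time and construct a DateTime value, and DateTime.Now additionally applies the local time-zone conversion.

```csharp
// Sketch: do "work" (increment a counter) for ~1 second, using two
// different time sources for the deadline check.
int counter1 = 0;
int start = Environment.TickCount;            // cheap tick read
while (Environment.TickCount - start < 1000)
    counter1++;

long counter2 = 0;
DateTime end = DateTime.UtcNow.AddSeconds(1); // per-iteration: fetch time + build DateTime
while (DateTime.UtcNow < end)
    counter2++;

// counter1 typically ends up much larger than counter2, because every
// iteration of the second loop pays the heavier time-check cost.
```

Since the loop body is nearly free, the benchmark is effectively measuring the time source itself, which is why the gap looks so dramatic.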

How expensive is it to convert between int and double?

女生的网名这么多〃 submitted on 2019-12-31 18:58:56
Question: I often see code that converts ints to doubles and back again (sometimes for good reasons, sometimes not), and it just occurred to me that this seems like a "hidden" cost in my program. Let's assume the conversion method is truncation. So just how expensive is it? I'm sure it varies depending on hardware, so let's assume a newish Intel processor (Haswell, if you like, though I'll take anything). Some metrics I'd be interested in (though a good answer needn't have all …
