Python Get Screen Pixel Value in OS X

醉梦人生 2020-12-16 18:43

I'm building an automated game bot in Python on OS X 10.8.2, and while researching Python GUI automation I discovered autopy. The mouse manipul

3 Answers
  • 2020-12-16 19:08

    This was all so helpful that I had to come back to comment; however, I don't have the reputation. I do have sample code that combines the other answers in this thread for a very quick screen capture and save, thanks to @dbr and @qqg!

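    import numpy as np
    from PIL import Image  # scipy.misc.imsave was removed in SciPy 1.2; Pillow is used instead
    import Quartz.CoreGraphics as CG
    
    image = CG.CGWindowListCreateImage(CG.CGRectInfinite, CG.kCGWindowListOptionOnScreenOnly, CG.kCGNullWindowID, CG.kCGWindowImageDefault)
    
    prov = CG.CGImageGetDataProvider(image)
    _data = CG.CGDataProviderCopyData(prov)
    
    width = CG.CGImageGetWidth(image)
    height = CG.CGImageGetHeight(image)
    
    # Interpret the raw bytes as 4-byte BGRA pixels (np.frombuffer replaces
    # the deprecated np.fromstring; // keeps the count an integer on Python 3)
    imgdata = np.frombuffer(_data, dtype=np.uint8).reshape(len(_data) // 4, 4)
    # Drop the alpha channel and reverse BGR to RGB before saving
    numpy_img = imgdata[:width * height, 2::-1].reshape(height, width, 3)
    Image.fromarray(np.ascontiguousarray(numpy_img)).save('test_fast.png')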
    
  • 2020-12-16 19:09

    A small improvement: screencapture's TIFF format option (-t tiff) is a bit quicker than PNG:

    $ time screencapture -t png /tmp/test.png
    real        0m0.235s
    user        0m0.191s
    sys         0m0.016s
    $ time screencapture -t tiff /tmp/test.tiff
    real        0m0.079s
    user        0m0.028s
    sys         0m0.026s
    

    This does have a lot of overhead, as you say (the subprocess creation, writing/reading from disc, compressing/decompressing).
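
    For comparison, the screencapture approach timed above could be driven from Python roughly like this. It is only a sketch: the helper names screencapture_command and time_capture are made up for illustration, and it assumes the macOS screencapture tool is on the PATH (-x suppresses the shutter sound, -t selects the format).

```python
import subprocess
import time


def screencapture_command(path, fmt="tiff"):
    """Build the argument list for the macOS screencapture CLI.

    Hypothetical helper: -x silences the capture sound, -t sets the format.
    """
    return ["screencapture", "-x", "-t", fmt, path]


def time_capture(path, fmt="tiff"):
    """Run screencapture once and return the elapsed time in milliseconds."""
    start = time.time()
    subprocess.check_call(screencapture_command(path, fmt))
    return (time.time() - start) * 1000.0


# Example (macOS only):
#   for fmt in ("png", "tiff"):
#       print(fmt, "%.0fms" % time_capture("/tmp/test." + fmt, fmt))
```

    Every call still pays for process creation and a round trip through the disk, which is the overhead described above.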

    Instead, you could use PyObjC to capture the screen using CGWindowListCreateImage. I found it took about 70ms (~14fps) to capture a 1680x1050 pixel screen and have the values accessible in memory.

    A few random notes:

    • Importing the Quartz.CoreGraphics module is the slowest part, at about 1 second. The same is true of importing most PyObjC modules. This is unlikely to matter here, but for short-lived processes you might be better off writing the tool in ObjC.
    • Specifying a smaller region is a bit quicker, but not hugely (~40ms for a 100x100px block, ~70ms for 1680x1050). Most of the time seems to be spent in the CGDataProviderCopyData call. I wonder if there's a way to access the data directly, since we don't need to modify it?
    • The ScreenPixel.pixel function is pretty quick, but accessing large numbers of pixels is still slow (0.01ms * 1680*1050 is about 17.6 seconds). If you need to access lots of pixels, it is probably quicker to struct.unpack_from them all in one go.
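
    The last note could be sketched as below: one struct.unpack_from call for the whole buffer instead of one call per pixel. The function name unpack_all_pixels is hypothetical, and the sketch assumes the buffer is tightly packed BGRA (exactly 4 * width bytes per row, no padding), which may not hold for every capture.

```python
import struct


def unpack_all_pixels(data, width, height):
    """Return a list of (r, g, b, a) tuples in row-major order.

    Assumes `data` holds width * height BGRA pixels with no row padding.
    """
    count = width * height
    # One unpack call for every byte in the image
    flat = struct.unpack_from("%dB" % (4 * count), data, 0)
    # Regroup the flat byte tuple into pixels, reordering BGRA to RGBA
    return [(flat[i + 2], flat[i + 1], flat[i], flat[i + 3])
            for i in range(0, 4 * count, 4)]


# Two synthetic BGRA pixels
sample = bytes([1, 2, 3, 4, 5, 6, 7, 8])
print(unpack_all_pixels(sample, 2, 1))  # [(3, 2, 1, 4), (7, 6, 5, 8)]
```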

    Here's the code:

    import time
    import struct
    
    import Quartz.CoreGraphics as CG
    
    
    class ScreenPixel(object):
        """Captures the screen using CoreGraphics, and provides access to
        the pixel values.
        """
    
        def capture(self, region = None):
            """region should be a CGRect, something like:
    
            >>> import Quartz.CoreGraphics as CG
            >>> region = CG.CGRectMake(0, 0, 100, 100)
            >>> sp = ScreenPixel()
            >>> sp.capture(region=region)
    
            The default region is CG.CGRectInfinite (captures the full screen)
            """
    
            if region is None:
                region = CG.CGRectInfinite
            else:
                # TODO: Odd widths cause the image to warp. This is likely
                # caused by the offset calculation in ScreenPixel.pixel, which
                # could be modified to allow odd widths
                if region.size.width % 2 > 0:
                    emsg = "Capture region width should be even (was %s)" % (
                        region.size.width)
                    raise ValueError(emsg)
    
            # Create screenshot as CGImage
            image = CG.CGWindowListCreateImage(
                region,
                CG.kCGWindowListOptionOnScreenOnly,
                CG.kCGNullWindowID,
                CG.kCGWindowImageDefault)
    
            # Intermediate step, get pixel data as CGDataProvider
            prov = CG.CGImageGetDataProvider(image)
    
            # Copy data out of CGDataProvider, becomes string of bytes
            self._data = CG.CGDataProviderCopyData(prov)
    
            # Get width/height of image
            self.width = CG.CGImageGetWidth(image)
            self.height = CG.CGImageGetHeight(image)
    
        def pixel(self, x, y):
            """Get pixel value at given (x,y) screen coordinates
    
            Must call capture first.
            """
    
            # Pixel data is unsigned char (8bit unsigned integer),
            # and there are four (blue,green,red,alpha)
            data_format = "BBBB"
    
            # Calculate offset, based on
            # http://www.markj.net/iphone-uiimage-pixel-color/
            offset = 4 * ((self.width*int(round(y))) + int(round(x)))
    
            # Unpack data from string into Python'y integers
            b, g, r, a = struct.unpack_from(data_format, self._data, offset=offset)
    
            # Return BGRA as RGBA
            return (r, g, b, a)
    
    
    if __name__ == '__main__':
        # Timer helper-function
        import contextlib
    
        @contextlib.contextmanager
        def timer(msg):
            start = time.time()
            yield
            end = time.time()
            print("%s: %.02fms" % (msg, (end-start)*1000))
    
    
        # Example usage
        sp = ScreenPixel()
    
        with timer("Capture"):
            # Take screenshot (takes about 70ms for me)
            sp.capture()
    
        with timer("Query"):
            # Get pixel value (takes about 0.01ms)
            print(sp.width, sp.height)
            print(sp.pixel(0, 0))
    
    
        # To verify screen-cap code is correct, save all pixels to PNG,
        # using http://the.taoofmac.com/space/projects/PNGCanvas
    
        from pngcanvas import PNGCanvas
        c = PNGCanvas(sp.width, sp.height)
        for x in range(sp.width):
            for y in range(sp.height):
                c.point(x, y, color = sp.pixel(x, y))
    
        with open("test.png", "wb") as f:
            f.write(c.dump())
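
    On the TODO about odd widths: one plausible cause (an assumption, not something confirmed in this thread) is that CoreGraphics pads each row, so the distance between rows in bytes can be larger than 4 * width. CGImageGetBytesPerRow reports that row stride. A minimal sketch of an offset calculation that uses it, with a hypothetical standalone function pixel_at and synthetic data so it runs without a screen capture:

```python
import struct


def pixel_at(data, bytes_per_row, x, y):
    """Return (r, g, b, a) from a BGRA buffer whose rows start
    bytes_per_row bytes apart (which may include padding)."""
    offset = bytes_per_row * int(round(y)) + 4 * int(round(x))
    b, g, r, a = struct.unpack_from("BBBB", data, offset)
    return (r, g, b, a)


# Synthetic 3x2 BGRA image: 12 bytes of pixels per row, padded to 16
row0 = bytes([0, 0, 255, 255, 0, 255, 0, 255, 255, 0, 0, 255]) + b"\x00" * 4
row1 = bytes([10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]) + b"\x00" * 4
data = row0 + row1

print(pixel_at(data, 16, 0, 0))  # (255, 0, 0, 255)
print(pixel_at(data, 16, 2, 1))  # (110, 100, 90, 120)
```

    In the class above, this would mean storing CG.CGImageGetBytesPerRow(image) in capture() and using it in pixel() in place of 4 * self.width.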
    
  • 2020-12-16 19:25

    I came across this post while searching for a way to take screenshots on Mac OS X for real-time processing. I tried ImageGrab from PIL, as suggested in some other posts, but couldn't get the data fast enough (only about 0.5 fps).

    The answer https://stackoverflow.com/a/13024603/3322123 in this post to use PyObjC saved my day! Thanks @dbr!

    However, my task requires all pixel values rather than a single pixel. Following up on the third note by @dbr, I added a new method to the class that returns the full image, in case anyone else needs it.

    The image data is returned as a numpy array of shape (height, width, 3), which can be used directly for post-processing in numpy, OpenCV, etc. Getting individual pixel values from it is then trivial with numpy indexing.

    I tested the code with a 1600 x 1000 screenshot: getting the data with capture() took ~30 ms, and converting it to a numpy array with getimage() took only ~50 ms on my MacBook. So now I get >10 fps, and even more for smaller regions.

    import numpy as np
    
    def getimage(self):
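        # np.frombuffer replaces the deprecated np.fromstring;
        # // keeps the row count an integer on Python 3
        imgdata = np.frombuffer(self._data, dtype=np.uint8).reshape(len(self._data) // 4, 4)
        return imgdata[:self.width*self.height, :-1].reshape(self.height, self.width, 3)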
    

    Note that I throw away the alpha channel of the 4-channel BGRA data, so the returned array is in BGR order.
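
    A variant of the same conversion, sketched as a standalone function so it can be tried on synthetic bytes. The name bgra_to_array and its bytes_per_row and rgb parameters are hypothetical. It assumes the capture may pad each row (the stride would come from CG.CGImageGetBytesPerRow) and falls back to 4 * width when no stride is given.

```python
import numpy as np


def bgra_to_array(data, width, height, bytes_per_row=None, rgb=False):
    """Convert a raw BGRA buffer to a (height, width, 3) uint8 array.

    bytes_per_row is the row stride in bytes; defaults to 4 * width.
    Returns BGR order (OpenCV's default) unless rgb=True.
    """
    if bytes_per_row is None:
        bytes_per_row = 4 * width
    rows = np.frombuffer(data, dtype=np.uint8, count=height * bytes_per_row)
    rows = rows.reshape(height, bytes_per_row)
    # Cut off any padding at the end of each row, then group bytes into pixels
    pixels = rows[:, :4 * width].reshape(height, width, 4)
    bgr = pixels[:, :, :3]  # drop the alpha channel
    return bgr[:, :, ::-1] if rgb else bgr


# Synthetic 3x2 image with rows padded from 12 to 16 bytes
row0 = bytes([0, 0, 255, 255, 0, 255, 0, 255, 255, 0, 0, 255]) + b"\x00" * 4
row1 = bytes([10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120]) + b"\x00" * 4
img = bgra_to_array(row0 + row1, 3, 2, bytes_per_row=16)
print(img.shape)  # (2, 3, 3)
```

    With rgb=True the channels come back reversed, which suits libraries that expect RGB rather than BGR.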
