How to design a sequential hash-like function

谎友^ 2020-12-14 04:57

I want to develop something similar to jsfiddle, where the user can input some data, "save" it, and get a unique, random-looking URL that loads that data.

6 Answers
  • 2020-12-14 05:18

    In my opinion, if you also keep the save time of each entry on the server, you can generate the hash from both values: hash = func(id, time). With only hash = func(id), it would be too easy to work out.
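
    A minimal sketch of what func(id, time) could look like, assuming an MD5 digest truncated to 8 hex characters. The name make_hash and the $length parameter are hypothetical, not part of the answer. The result is not reversible, so the mapping from hash back to id would still have to be stored.

```php
<?php
// Hypothetical helper: derive a short, random-looking name from the
// sequential id and the save time kept on the server.
function make_hash(int $id, int $time, int $length = 8): string {
    // The digest depends on both values, so knowing the id alone
    // is not enough to recompute it.
    return substr(md5($id . ':' . $time), 0, $length);
}

echo make_hash(42, time()), "\n";
```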

  • 2020-12-14 05:20

    It's an odd set of constraints. I routinely use MD5 checksums to generate unique URLs from data. If the user doesn't already have the data, they can't guess the URLs.

    I do understand about not wanting to use a database: if you've never used one before, the learning curve can be a little steep.

    I don't understand the constraint about "storing things sequentially on the server." If you need to know the order in which the hashes are created, I'd simply put that information in a separate file. You might have to do file locking or some other kind of hack to make sure you can append a hash to that file incrementally.
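
    One way to do the locked append, as a sketch: PHP's file_put_contents() accepts FILE_APPEND together with LOCK_EX. The function names and the log path are hypothetical.

```php
<?php
// Hypothetical helper: append one hash per line to an order log.
// LOCK_EX asks file_put_contents() to hold an exclusive lock while it
// writes, so concurrent requests do not interleave their lines.
function record_hash(string $logPath, string $hash): void {
    file_put_contents($logPath, $hash . "\n", FILE_APPEND | LOCK_EX);
}

// Read the hashes back in the order they were created.
function read_hashes(string $logPath): array {
    return file($logPath, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
}
```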

    If you want short URLs, you can either take a prefix of an MD5 checksum or you can take a CRC-32 and base64 encode it. Both will give you unique URLs with reasonably good probability.
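
    Both options can be sketched in a few lines. The helper names and the default prefix length of 8 are assumptions made for illustration.

```php
<?php
// Hypothetical helper: the first $length hex characters of the MD5 checksum.
function short_id_md5(string $data, int $length = 8): string {
    return substr(md5($data), 0, $length);
}

// Hypothetical helper: CRC-32 of the data as 4 raw bytes, base64-encoded.
function short_id_crc32(string $data): string {
    $raw = hash('crc32b', $data, true);      // 4 raw bytes
    $b64 = rtrim(base64_encode($raw), '=');  // 6 characters once padding is stripped
    return strtr($b64, '+/', '-_');          // make it URL-safe
}
```

    Neither ID is guaranteed to be unique, so it is still worth checking for an existing entry before saving under it.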

  • 2020-12-14 05:21

    Here's how I implemented it. This is save.php (can someone tell me whether it has any design flaws?):

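    <?php
    
    // Read, increment, and store the sequential counter under an exclusive
    // lock so that two simultaneous saves cannot get the same index.
    $fp = fopen('saves/data/placeholder', 'c+');
    flock($fp, LOCK_EX);
    $index = (int) stream_get_contents($fp) + 1;
    ftruncate($fp, 0);
    rewind($fp);
    fwrite($fp, (string) $index);
    flock($fp, LOCK_UN);
    fclose($fp);
    
    // Pick a random 4-character name, retrying until it is unused.
    $string = '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz';
    do {
        $hash = $string[rand(0, 61)] . $string[rand(0, 61)] . $string[rand(0, 61)] . $string[rand(0, 61)];
    } while (file_exists('saves/' . $hash));
    
    // The short name points to the index; the index points to the data.
    file_put_contents('saves/' . $hash, $index);
    file_put_contents('saves/data/' . $index, $_REQUEST['data']);
    
    echo $hash;
    
    ?>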
    

    And here's load.php:

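    <?php
    
    // Accept only the 4-character names that save.php generates, so a
    // request cannot reach other paths (for example "data/log" or "../").
    $file = is_string($_REQUEST['file'] ?? null) ? $_REQUEST['file'] : '';
    if (!preg_match('/^[0-9A-Za-z]{4}$/', $file) || !file_exists('saves/' . $file)) {
        file_put_contents('saves/data/log', 'requested saves/' . $file . "\n", FILE_APPEND);
        die();
    }
    // The file behind the short name holds the sequential index of the data.
    $file_pointer = file_get_contents('saves/' . $file);
    
    if (!file_exists('saves/data/' . $file_pointer)) {
        file_put_contents('saves/data/log', 'requested saves/data/' . $file_pointer . ' from ' . $file . "\n", FILE_APPEND);
        die();
    }
    echo file_get_contents('saves/data/' . $file_pointer);
    
    ?>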
    

    Hope this helps others.

  • 2020-12-14 05:21

    This can't really be reversible. The only way (the one used by URL shorteners and jsfiddle) is to store the generated hash (strictly speaking, a digest) in a table or some other data structure and look it up on retrieval.

    Why this?

    Going from, say, 128 characters of data to a digest of 4 visible characters, you lose a lot of data.
    You cannot store the remaining data in the magical cracks between those 4 bytes; there are none.
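
    The counting argument can be shown on a small scale. The sketch below is an illustration, not anything from jsfiddle: it shrinks the digest to a single hex character (the first digit of an MD5), so 17 distinct inputs must produce at least one collision. The function name and the inputs are made up.

```php
<?php
// Map each input to a 1-character digest (first hex digit of its MD5).
// With 17 inputs and only 16 possible digests, two inputs must collide.
function find_collision(int $count = 17): ?array {
    $seen = [];
    for ($i = 0; $i < $count; $i++) {
        $input = 'input' . $i;
        $digest = md5($input)[0];
        if (isset($seen[$digest])) {
            return [$seen[$digest], $input, $digest];
        }
        $seen[$digest] = $input;
    }
    return null;
}

list($a, $b, $digest) = find_collision();
echo "$a and $b both map to $digest\n";
```

    Once two inputs share a digest, the digest alone cannot say which one was meant, which is why the lookup table is needed.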

  • 2020-12-14 05:34

    It's possible to do this, but I would suggest using an alphabet of 64 characters, as that makes it a lot easier: four 6-bit characters hold exactly 24 bits.

    Use a combination of these:

    • bit reordering
    • xor with a number
    • put it into a 24-bit maximal-length LFSR and run a couple of cycles.

    The LFSR is highly recommended because it scrambles the bits well. The rest are optional. All of these manipulations are reversible, which guarantees that each output is unique.
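
    The two optional steps can be sketched as follows, assuming a 7-bit rotation as the bit reordering and an arbitrary XOR constant. Both values and all function names are picked here for illustration only. Each step has an exact inverse, so distinct 24-bit inputs always give distinct outputs.

```php
<?php
const MASK24  = 0xFFFFFF;  // keep values within 24 bits
const XOR_KEY = 0x5A3C96;  // arbitrary constant, chosen for illustration

// Bit reordering: rotate the 24-bit value left by $n bits (0 < $n < 24).
function rotl24(int $x, int $n): int {
    return (($x << $n) | ($x >> (24 - $n))) & MASK24;
}

// Inverse of rotl24(): rotate right by the same amount.
function rotr24(int $x, int $n): int {
    return (($x >> $n) | ($x << (24 - $n))) & MASK24;
}

// XOR with the constant, then rotate.
function scramble(int $x): int {
    return rotl24($x ^ XOR_KEY, 7);
}

// Undo the steps in the opposite order.
function unscramble(int $y): int {
    return rotr24($y, 7) ^ XOR_KEY;
}
```

    On their own these two steps leave visible patterns between consecutive inputs, which is one reason to combine them with the LFSR.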

    Once you have calculated the "shuffled" number, simply pack it into a binary string and encode it with base64_encode.

    To decode, simply apply the inverse of these operations in reverse order.

    Sample (2^24 long unique sequence):

    function lfsr($x) {
        return ($x >> 1) ^ (($x&1) ? 0xe10000 : 0);
    }
    function to_4($x) {
        for($i=0;$i<24;$i++)
            $x = lfsr($x);
        $str = pack("CCC", $x >> 16, ($x >> 8) & 0xff, $x & 0xff);
        return base64_encode($str);
    }
    
    function rev_lfsr($x) {
        $bit = $x & 0x800000;
        $x = $x ^ ($bit ? 0xe10000 : 0);
        return ($x << 1) + ($bit ? 1 : 0);
    }
    function from_4($str) {
        $str = base64_decode($str);
        $x = unpack("C*", $str);
        $x = $x[1]*65536 + $x[2] * 256 + $x[3];
        for($i=0;$i<24;$i++)
            $x = rev_lfsr($x);
        return $x;
    }
    
    for($i=0; $i<256; $i++) {
        $enc = to_4($i);
        echo $enc . " " . from_4($enc) . "\n";
    }
    

    Output:

    AAAA 0
    kgQB 1
    5ggD 2
    dAwC 3
    DhAH 4
    nBQG 5
    6BgE 6
    ehwF 7
    HCAO 8
    jiQP 9
    +igN 10
    aCwM 11
    EjAJ 12
    gDQI 13
    9DgK 14
    ZjwL 15
    OEAc 16
    qkQd 17
    3kgf 18
    TEwe 19
    NlAb 20
    pFQa 21
    0FgY 22
    
    ...
    

    Note: for use in a URL, replace + and / with - and _.
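
    That replacement can be done with strtr() in both directions. The helper names below are hypothetical.

```php
<?php
// Swap the two URL-unsafe base64 characters for URL-safe ones.
function to_url_safe(string $b64): string {
    return strtr($b64, '+/', '-_');
}

// Restore the standard base64 alphabet before decoding.
function from_url_safe(string $id): string {
    return strtr($id, '-_', '+/');
}

echo to_url_safe('+igN'), "\n"; // prints "-igN"
```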

    Note: although this works, for a simple scenario like yours it's probably easier to generate random filenames until you find one that doesn't exist yet. Nobody cares about the number of the entry.

  • 2020-12-14 05:38

    Here's a reversible library that works with bcmath:
    http://blog.kevburnsjr.com/php-unique-hash
