Iterate twice on values (MapReduce)

轮回少年 2020-11-29 07:22

I receive an iterator as an argument and I would like to iterate over its values twice.

public void reduce(Pair key, Iterator          


        
11 answers
  • 2020-11-29 08:21

    If we try to iterate twice in a Reducer like this:

    ListIterator<DoubleWritable> lit = IteratorUtils.toListIterator(it);
    System.out.println("Using ListIterator 1st pass");
    while(lit.hasNext())
        System.out.println(lit.next());
    
    // move the list iterator back to start
    while(lit.hasPrevious())
        lit.previous();
    
    System.out.println("Using ListIterator 2nd pass");
    while(lit.hasNext())
        System.out.println(lit.next());
    

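    the output is as follows. The second pass repeats a single value instead of reproducing the first pass, which is consistent with Hadoop reusing the same Writable instance for every value it hands to the reducer: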

    Using ListIterator 1st pass
    5.3
    4.9
    5.3
    4.6
    4.6
    Using ListIterator 2nd pass
    5.3
    5.3
    5.3
    5.3
    5.3
    

    To get the correct result, copy each value into a new object while caching it:

    ArrayList<DoubleWritable> cache = new ArrayList<DoubleWritable>();
    for (DoubleWritable aNum : values) {
        System.out.println("first iteration: " + aNum);
        // copy the value into a new object before caching it
        DoubleWritable writable = new DoubleWritable();
        writable.set(aNum.get());
        cache.add(writable);
    }
    int size = cache.size();
    for (int i = 0; i < size; ++i) {
        System.out.println("second iteration: " + cache.get(i));
    }
    

    Output

    first iteration: 5.3
    first iteration: 4.9
    first iteration: 5.3
    first iteration: 4.6
    first iteration: 4.6
    second iteration: 5.3
    second iteration: 4.9
    second iteration: 5.3
    second iteration: 4.6
    second iteration: 4.6
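
    The effect can be reproduced without Hadoop. The following is only a sketch: Box is a hypothetical stand-in for a Writable, and reusingIterator is an assumed model of an iterator that overwrites and returns one shared object. It is not Hadoop's actual implementation, so which value ends up repeated may differ from the output above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {

    // Hypothetical mutable holder, standing in for a Writable such as DoubleWritable.
    static final class Box {
        private double value;
        Box(double value) { this.value = value; }
        double get() { return value; }
        void set(double value) { this.value = value; }
        @Override public String toString() { return Double.toString(value); }
    }

    // Iterator that returns the SAME Box on every call to next(),
    // overwriting its contents each time (the assumed reuse behaviour).
    static Iterator<Box> reusingIterator(double... data) {
        Box shared = new Box(0);
        return new Iterator<Box>() {
            private int index = 0;
            @Override public boolean hasNext() { return index < data.length; }
            @Override public Box next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.set(data[index++]);
                return shared;
            }
        };
    }

    // Caches references only: every list entry is the same shared object.
    static List<Box> cacheReferences(Iterator<Box> it) {
        List<Box> cache = new ArrayList<>();
        while (it.hasNext()) cache.add(it.next());
        return cache;
    }

    // Caches a fresh copy per value, like the corrected loop.
    static List<Box> cacheCopies(Iterator<Box> it) {
        List<Box> cache = new ArrayList<>();
        while (it.hasNext()) cache.add(new Box(it.next().get()));
        return cache;
    }

    public static void main(String[] args) {
        double[] data = {5.3, 4.9, 5.3, 4.6, 4.6};
        System.out.println(cacheReferences(reusingIterator(data)));
        System.out.println(cacheCopies(reusingIterator(data)));
    }
}
```

    In this sketch the reference-only cache holds five entries that all print the same value, while the copying cache preserves all five values in order.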
    
  • 2020-11-29 08:22

    Reusing the given iterator, no.

    But you can save the values in an ArrayList while iterating through them the first time and then iterate over that ArrayList. Alternatively, build the list up front with a Collection helper method and iterate over it twice. It's a matter of taste.
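
    As a minimal sketch of building the list up front: the helper name toList and the class IteratorCache are made up for illustration, and the code assumes the elements are safe to keep (that is, the producer does not reuse them).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IteratorCache {

    // Drains whatever is left in the iterator into a new list.
    // Assumes the elements are not reused or mutated by the producer.
    static <T> List<T> toList(Iterator<T> it) {
        List<T> cache = new ArrayList<>();
        it.forEachRemaining(cache::add);
        return cache;
    }

    public static void main(String[] args) {
        Iterator<Integer> it = Arrays.asList(3, 1, 2).iterator();
        List<Integer> values = toList(it);

        int sum = 0;
        for (int v : values) {
            sum += v;                              // first pass: total
        }
        for (int v : values) {
            System.out.println(v + " of " + sum);  // second pass: uses the total
        }
    }
}
```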

    Anyway, are you sure passing an Iterator is a good idea in the first place? Iterators are meant for a single linear scan through a collection, which is why they don't expose a "rewind" method.

    You should pass something different, like a Collection<T> or an Iterable<T>, as already suggested in a different answer.

  • 2020-11-29 08:25

    You have to cache the values from the iterator if you want to iterate again. At least you can combine the first iteration with the caching:

    Iterator<IntWritable> it = getIterator();
    List<IntWritable> cache = new ArrayList<IntWritable>();
    
    // first loop and caching
    while (it.hasNext()) {
       IntWritable value = it.next();
       doSomethingWithValue(value);
       cache.add(value);
    }
    
    // second loop
    for(IntWritable value:cache) {
       doSomethingElseThatCantBeDoneInFirstLoop(value);
    }
    

    (just to add an answer with code, knowing that you mentioned this solution in your own comment ;) )


    Why it's impossible without caching: Iterator is just an interface, and nothing requires an implementation to actually store its values. To iterate twice you would have to either reset the iterator (not possible) or clone it (again, not possible).

    To give an example for an iterator where cloning/resetting wouldn't make any sense:

    import java.util.Iterator;
    
    public class Randoms implements Iterator<Double> {
    
      // number of random values still to be produced
      private int counter = 10;
    
      @Override
      public boolean hasNext() {
         return counter > 0;
      }
    
      @Override
      public Double next() {
         counter--;
         return Math.random();
      }
    
      @Override
      public void remove() {
         throw new UnsupportedOperationException("delete not supported");
      }
    }
    
  • 2020-11-29 08:25

    Iterators are one-traversal-only. Some iterator types are cloneable, and you might be able to clone it before traversing, but this isn't the general case.

    You should make your function take an Iterable instead, if you can achieve that at all.
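
    A minimal sketch of that idea, with a made-up class TwoPass and method deviationsFromMean. It only works when the Iterable returns a fresh iterator on each call, as the standard collections do; an Iterable backed by a one-shot source would still need caching.

```java
import java.util.Arrays;
import java.util.List;

public class TwoPass {

    // Takes an Iterable, so each for-each loop asks for its own iterator.
    static double[] deviationsFromMean(Iterable<Double> values) {
        double sum = 0;
        int count = 0;
        for (double v : values) {       // first pass: sum and count
            sum += v;
            count++;
        }
        double mean = (count == 0) ? 0 : sum / count;

        double[] result = new double[count];
        int i = 0;
        for (double v : values) {       // second pass: needs the mean
            result[i++] = v - mean;
        }
        return result;
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(1.0, 2.0, 3.0);
        System.out.println(Arrays.toString(deviationsFromMean(values)));
    }
}
```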

  • 2020-11-29 08:25

    If you want to change values as you go, I think it's better to use a ListIterator and its set() method.

    // assumes list is a List<String>
    ListIterator<String> lit = list.listIterator();
    while (lit.hasNext()) {
       String elem = lit.next();
       System.out.println(elem);
       lit.set(elem + " modified");
    }
    // get a fresh iterator positioned at the start of the list
    lit = list.listIterator();
    while (lit.hasNext()) {
       System.out.println(lit.next());
    }
    

    Instead of calling .previous() repeatedly, I just get a new ListIterator from the same list by calling .listIterator() again.
