Finding next input element using Mechanize?


Using Mechanize, is it possible to find a phrase in the HTML of a page, for example "email", then find the next input element after it and fill in that input?


                      


      
      
        
3 Answers

        
                         				            
            
           
            
                              
                
              
              
                
                  礼貌的吻别        
                
              
                            
                2021-01-20 22:10
              
            
            
                                                                       
In a well-formed HTML page, an input element should have a label describing what the input is for. In that case, you can iterate over all the label elements, find the one containing the text "email", and get the associated input through the label's for attribute.

However, not all HTML pages are well-formed: the label may be missing, the for attribute may be absent, or the markup may be broken in other ways.

If you mean the input that comes right after some element in the DOM, you can do some DOM traversal to check whether an element containing "email" has an input element next to it.

If you mean the input next to an element in the rendered page, you first have to define what "next to" means, and I don't think you can get what you want without considerable effort. An element located after the "email" element in the markup might be placed before it visually with some CSS trick. You would need some graphical API to find that input, and I don't see one in Watir's API documentation.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  长情又很酷        
                
              
                            
                2021-01-20 22:14
              
            
            
                                                                       
Mechanize uses Nokogiri internally to handle its DOM parsing, which is the basis of its ability to locate different elements in a page.

It's possible to access the parsed DOM and, through it, use Nokogiri to locate elements that Mechanize doesn't normally let us find. For instance:

require 'mechanize'

agent = Mechanize.new
page = agent.get('http://www.example.com')

# Use Nokogiri to find the content of the <h1> tag...
puts page.at('h1').content # => "Example Domain"


For your search, you'd want to use an XPath expression to locate where "email" appears in the page. Once you've done that, you can locate the next <input> tag.

Starting from a simple HTML fragment, we'll pretend this comes from Mechanize:

require 'nokogiri'

page = Nokogiri::HTML('<div><form><p>email</p><input name="email"></form></div>')
puts page.to_html


The parsed document looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><div><form>
<p>email</p>
<input name="email">
</form></div></body></html>


Searching for "email":

page.at("//*[contains(text(),'email')]")
# => #<Nokogiri::XML::Element:0x3ff50d0c4bc0 name="p" children=[#<Nokogiri::XML::Text:0x3ff50d0c497c "email">]>


Building upon that, this gets the <input> tag:

input_tag = page.at("//*[contains(text(),'email')]/following-sibling::input")
# => #<Nokogiri::XML::Element:0x3ff50d09b75c name="input" attributes=[#<Nokogiri::XML::Attr:0x3ff50d09b5f4 name="name" value="email">]>


Once you've found that input tag, you can get the "name" from the tag using Nokogiri, and then tell Mechanize to locate and fill in that particular input field:

input_tag['name']
# => "email"


For a web form to function correctly, its elements have to have names, because those names are passed to the server when the form is submitted. Without them it would take a lot of work to determine which input sent a particular piece of data, and, programmers being lazy, we don't want to work that hard, so you can count on having a name to work with.

See "Ruby Mechanize, Nokogiri and Net::HTTP" for more information. Searching Stack Overflow and reading the Nokogiri documentation and tutorials will also give you much of what you need to work out the rest.
                                                                        
                                                        
            
            
              
                
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  攒了一身酷        
                
              
                            
                2021-01-20 22:26
              
            
            
                                                                       
First find the element with the phrase text:

el = page.at('*[text()*="some phrase"]')


From there you can get the first following input:

input = el.at('./following::input')


Now, find the ancestor form node of that input:

form_node = input.ancestors('form')[0]


Then use that to get the Mechanize::Form object:

form = page.form_with(:form_node => form_node)


And now you can fill in the value:

form[input[:name]] = 'foo'

                                                                        
                                                        
            
            
              
                