Question
I keep getting a null element when I use a CSS selector to find a form in a page.
final String LOGIN_FORM_URL = "https://student.naviance.com/sbrunswick";
Connection.Response loginFormResponse = Jsoup.connect(LOGIN_FORM_URL)
        .method(Connection.Method.GET)
        .userAgent(USER_AGENT)
        .execute();
FormElement loginForm = (FormElement) loginFormResponse.parse()
        .select("div#main-container > div.components-NewLogin-style-loginFormBody > form")
        .first();
I've tried many different CSS selectors to get loginForm, but it always comes back null.
If it helps, this is the tutorial I'm using to learn web scraping: https://jsoup.programmingpedia.net/en/tutorial/4631/logging-into-websites-with-jsoup
I've tried all sorts of selectors, such as the following:
div#main-container > div.components-NewLogin-style-loginFormBody > form
#main-container > div.components-NewLogin-style-loginFormBody > form
body > div > div > div > div > div > div > div > form
form.components-NewLogin-style-loginFormWrapper[data-test-id='login_form']
and other slight variations of these. However, loginForm is still null. Can anyone please help me with this? I've been stuck on this type of issue for a while now.
This is the HTML from the website.
<html lang="en-US">
<head>
<title>Login | Naviance Student</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<link rel="apple-touch-icon" href="/apple-icon.png">
<link rel="apple-touch-icon" sizes="76x76" href="/apple-icon-76x76.png">
<link rel="apple-touch-icon" sizes="114x114" href="/apple-icon-114x114.png">
<link rel="apple-touch-icon" sizes="144x144" href="/apple-icon-144x144.png">
<link rel="apple-touch-icon" sizes="152x152" href="/apple-icon-152x152.png">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-icon-180x180.png">
<link rel="apple-touch-startup-image" href="/apple-icon.png">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-title" content="Naviance Student">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="96x96" href="/favicon-96x96.png">
<link rel="manifest" href="/manifest.json">
<meta http-equiv="Page-Enter" content="RevealTrans(Duration=2.0,Transition=2)">
<meta http-equiv="Page-Exit" content="RevealTrans(Duration=3.0,Transition=12)">
<meta http-equiv="cleartype" content="on">
<meta name="msapplication-config" content="IEconfig.xml">
<meta name="application-name" content="Naviance Student">
<meta name="author" content="Naviance">
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
<link href="/style-16726.css" rel="stylesheet">
<link rel="preload" href="/main.e6791.js" as="script">
<link rel="stylesheet" type="text/css" href="/0.style-c7e46.css">
<script type="text/javascript" async src="https://www.gstatic.com/recaptcha/releases/UFwvoDBMjc8LiYc1DKXiAomK/recaptcha__en.js" crossorigin="anonymous" integrity="sha384-K2LYnZEtBUcW6O6eiKyrX5HgXfaBzWmW7BmI0mEp+JFPi3pZyyiJwjMDjI12BtQg"></script>
<script type="text/javascript" async src="https://www.google-analytics.com/plugins/ua/linkid.js"></script>
<script type="text/javascript" async src="//bat.bing.com/bat.js"></script>
<script type="text/javascript" async src="//www.googleadservices.com/pagead/conversion_async.js"></script>
<script type="text/javascript" async src="https://www.google-analytics.com/analytics.js"></script>
<script async src="//www.googletagmanager.com/gtm.js?id=GTM-NPKP2M"></script>
<script charset="utf-8" src="/fc.common.f467c.js"></script>
<link rel="stylesheet" type="text/css" href="/56.style-ed477.css">
<script charset="utf-8" src="/fc.school-lookup.3f1be.js"></script>
<script src="https://googleads.g.doubleclick.net/pagead/viewthroughconversion/949855375/?random=1606140298936&cv=9&fst=1606140298936&num=1&guid=ON&resp=GooglemKTybQhCsO&u_h=1080&u_w=1920&u_ah=1040&u_aw=1920&u_cd=24&u_his=2&u_tz=-300&u_java=false&u_nplug=3&u_nmime=4&gtm=2wgb41&sendb=1&ig=1&frm=0&url=https%3A%2F%2Fstudent.naviance.com%2Fauth%2Ffclookup&ref=https%3A%2F%2Fwww.naviance.com%2F&tiba=Search%20for%20a%20School%20%7C%20Naviance%20Student&hn=www.googleadservices.com&async=1&rfmt=3&fmt=4"></script>
<link rel="stylesheet" type="text/css" href="/43.style-f1a23.css">
<script charset="utf-8" src="/fc.login.f56d4.js"></script>
</head>
<body data-new-gr-c-s-check-loaded="14.984.0" data-gr-ext-installed="">
<script src="/rewritten_config.js?v=1605811315155"></script>
<div id="root">
<div class="components-App-style-app">
<div>
<div style="height: 0px; width: 0px;"></div>
<div>
<div style="padding-bottom: 3rem;">
<a class="components-Header-styles-skipMain" href="#main-container">Skip to main content</a>
<header id="header" class="components-Header-styles-header" role="banner">
<div>
<nav class="components-Header-styles-nav" data-test-id="nav">
<div class="components-Header-styles-navContainer">
<a class="components-Header-styles-logoWrapper" href="/main"><img class="components-Header-styles-emblem" data-test-id="family_connection_emblem" src="/static/naviance-emblem-2c575.svg" alt="Logo" role="presentation"><img class="components-Header-styles-logo nophone" data-test-id="family_connection_header" src="/static/naviance-student-rgb-0a577.svg" alt="Naviance Student" role="img"></a>
</div>
</nav>
</div>
</header>
<div id="main-container" class="components-NewLogin-style-loginFormContainer">
<div class="components-NewLogin-style-loginFormBack">
<a href="/sbrunswick">
<figure class="components-Icon-style-icon">
<svg xmlns="http://www.w3.org/2000/svg" viewbox="0 0 19 30">
<path fill-rule="evenodd" d="M12.858 29.028c.212.23.556.23.761.008l3.502-3.783a.614.614 0 0 0-.006-.819l-8.812-9.432 8.824-9.621a.597.597 0 0 0-.021-.813l-3.472-3.6a.524.524 0 0 0-.776.008L.342 14.583a.622.622 0 0 0 0 .834l12.516 13.61z"></path>
</svg>
</figure> Back</a>
</div>
<div class="components-NewLogin-style-loginFormBody">
<h3 class="components-NewLogin-style-loginWelcome">Welcome Student!</h3>
<div class="components-NewLogin-style-userTypeImageContainer">
<img src="/static/backpack-cd9ef.svg">
</div>
<p data-test-id="login_to_naviance"><strong>Login to Naviance</strong></p>
<form class="components-NewLogin-style-loginFormWrapper" data-test-id="login_form">
<label class="components-NewLogin-style-loginInputLabel" for="login-username">Email </label>
<input id="login-username" name="username" type="email" class="components-NewLogin-style-loginInput" placeholder="For example navigator@naviance.com" data-test-id="username" value=""><label class="components-NewLogin-style-loginInputLabel" for="login-password">Password</label>
<input id="login-password" name="password" type="password" class="components-NewLogin-style-loginInput" placeholder="Type password" data-test-id="password" value="">
<div class="components-NewLogin-style-loginRememberForget">
<label for="checkbox_7" class="components-Checkbox-styles-label components-Checkbox-styles-light"><input data-test-id="remember_me" aria-label="Select checkbox_7" name="remember" id="checkbox_7" class="components-Checkbox-styles-input" type="checkbox" checked>
<figure class="components-Icon-style-icon components-Checkbox-styles-icon">
<svg width="1792" height="1792" viewbox="0 0 1792 1792" xmlns="http://www.w3.org/2000/svg">
<path d="M1671 566q0 40-28 68l-724 724-136 136q-28 28-68 28t-68-28l-136-136-362-362q-28-28-28-68t28-68l136-136q28-28 68-28t68 28l294 295 656-657q28-28 68-28t68 28l136 136q28 28 28 68z"></path>
</svg>
</figure>
<div class="components-Checkbox-styles-children">
Remember me
</div></label><a href="/sbrunswick/forgot-password">Forgot your password?</a>
</div>
<div>
<div>
<div class="grecaptcha-badge" data-style="bottomright" style="width: 256px; height: 60px; display: block; transition: right 0.3s ease 0s; position: fixed; bottom: 14px; right: -186px; box-shadow: gray 0px 0px 5px; border-radius: 2px; overflow: hidden;">
<div class="grecaptcha-logo">
<iframe src="https://www.google.com/recaptcha/api2/anchor?ar=1&k=6LfAN84UAAAAABfGTP7s2vIfa9lpQWoXg28LcQGV&co=aHR0cHM6Ly9zdHVkZW50Lm5hdmlhbmNlLmNvbTo0NDM.&hl=en&type=image&v=UFwvoDBMjc8LiYc1DKXiAomK&theme=light&size=invisible&badge=bottomright&cb=319m7d6n7h1b" width="256" height="60" role="presentation" name="a-831s6wkibtn5" frameborder="0" scrolling="no" sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-top-navigation allow-modals allow-popups-to-escape-sandbox"></iframe>
</div>
<div class="grecaptcha-error"></div><textarea id="g-recaptcha-response" name="g-recaptcha-response" class="g-recaptcha-response" style="width: 250px; height: 40px; border: 1px solid rgb(193, 193, 193); margin: 10px 25px; padding: 0px; resize: none; display: none;"></textarea>
</div><iframe style="display: none;"></iframe>
</div>
</div><button type="submit" class="components-NewLogin-style-btnNew components-NewLogin-style-loginBtn" disabled>Continue</button>
</form><a class="components-NewLogin-style-additionalHelp" target="_blank" rel="noopener noreferrer" href="https://student.naviance.com/additional-help">Need additional help?</a><a></a>
<p><a></a><a href="/sbrunswick/register">I'm new and need to register!</a></p>
</div>
</div>
</div>
<footer class="components-NewFooter-styles-footer">
<div class="components-NewFooter-styles-schoolInfo">
<div class="components-NewFooter-styles-west">
<div class="components-NewFooter-HobsonsBrand-styles-main">
<div>
<img class="components-NewFooter-HobsonsBrand-styles-hLogo" data-test-id="family_connection_header" src="/static/hobsons_w_tagline-fe51f.svg" alt="Hobsons">
</div>
<div class="components-NewFooter-HobsonsBrand-styles-linksDiv">
<span class="components-NewFooter-HobsonsBrand-styles-links"><a class="components-ClickHOC-styles-medium" href="/privacy-statement">Privacy Policy</a></span><span class="components-NewFooter-HobsonsBrand-styles-linkSeparator nophone"> | </span><span class="components-NewFooter-HobsonsBrand-styles-links"><a class="components-ClickHOC-styles-medium" href="/privacy-statement#ca">Your CA Privacy Rights</a></span>
</div>
<div class="components-NewFooter-HobsonsBrand-styles-copyright">
© 2020 Hobsons. All rights reserved worldwide.
</div>
</div>
</div>
<div class="components-NewFooter-styles-east">
<div class="components-NewFooter-UserInfo-styles-main">
<section class="card components-Card-styles-card components-NewFooter-UserInfo-styles-profileCard">
<div class="">
<div class="components-NewFooter-UserInfo-styles-profileSchool">
<div class="components-NewFooter-UserInfo-styles-schoolAddress">
<span><strong>South Brunswick High School</strong></span>
<div>
PO Box 183 750 Ridge Road
</div>
<div>
Monmouth Junction, NJ 08852-9721
</div>
<div>
<a href="tel:(732) 329-4044">p: (732) 329-4044</a>
</div>
<div>
<a href="http://www.sbschools.org/" target="_blank" rel="nofollow external noopener noreferrer">www.sbschools.org/</a>
</div>
</div>
</div>
</div>
</section>
</div>
</div>
</div>
</footer>
</div>
</div>
</div>
</div>
<script src="/fc.vendors~main.bb74e.js"></script>
<script src="/main.e6791.js" async></script>
<div class="ReactModalPortal"></div>
<div style="width:0px; height:0px; display:none; visibility:hidden;" id="batBeacon454285104070">
<img style="width:0px; height:0px; display:none; visibility:hidden;" id="batBeacon632230391008" width="0" height="0" alt="" src="https://bat.bing.com/action/0?ti=21008698&Ver=2&mid=21bd982f-afff-46e5-91b6-2a20e7b0ea84&sid=df4f0f802d9411eb9747713b1ab291c6&vid=b3661b601ac011ebb52bb79d756baba6&vids=0&pi=1200101525&lg=en-US&sw=1920&sh=1080&sc=24&tl=Search%20for%20a%20School%20%7C%20Naviance%20Student&p=https%3A%2F%2Fstudent.naviance.com%2Fauth%2Ffclookup&r=https%3A%2F%2Fwww.naviance.com%2F&lt=692&evt=pageLoad&msclkid=N&sv=1&rn=96110">
</div>
<script src="https://www.google.com/recaptcha/api.js?onload=onloadcallback&render=explicit" async></script>
<div style="visibility: hidden; position: absolute; width: 100%; top: -10000px; left: 0px; right: 0px; transition: visibility 0s linear 0.3s, opacity 0.3s linear 0s; opacity: 0;">
<div style="width: 100%; height: 100%; position: fixed; top: 0px; left: 0px; z-index: 2000000000; background-color: rgb(255, 255, 255); opacity: 0.5;"></div>
<div style="margin: 0px auto; top: 0px; left: 0px; right: 0px; position: absolute; border: 1px solid rgb(204, 204, 204); z-index: 2000000000; background-color: rgb(255, 255, 255); overflow: hidden;">
<iframe title="recaptcha challenge" src="https://www.google.com/recaptcha/api2/bframe?hl=en&v=UFwvoDBMjc8LiYc1DKXiAomK&k=6LfAN84UAAAAABfGTP7s2vIfa9lpQWoXg28LcQGV&cb=bcjuob9xkiu4" name="c-831s6wkibtn5" frameborder="0" scrolling="no" sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-top-navigation allow-modals allow-popups-to-escape-sandbox" style="width: 100%; height: 100%;"></iframe>
</div>
</div>
</body>
</html>
This is the HTML I get when I log loginFormResponse.
<html lang="en-US">
<head>
<title>Naviance Student</title>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1,minimum-scale=1">
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
<link rel="apple-touch-icon" href="/apple-icon.png">
<link rel="apple-touch-icon" sizes="76x76" href="/apple-icon-76x76.png">
<link rel="apple-touch-icon" sizes="114x114" href="/apple-icon-114x114.png">
<link rel="apple-touch-icon" sizes="144x144" href="/apple-icon-144x144.png">
<link rel="apple-touch-icon" sizes="152x152" href="/apple-icon-152x152.png">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-icon-180x180.png">
<link rel="apple-touch-startup-image" href="/apple-icon.png">
<meta name="apple-mobile-web-app-capable" content="yes">
<meta name="apple-mobile-web-app-title" content="Naviance Student">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="96x96" href="/favicon-96x96.png">
<link rel="manifest" href="/manifest.json">
<meta http-equiv="Page-Enter" content="RevealTrans(Duration=2.0,Transition=2)">
<meta http-equiv="Page-Exit" content="RevealTrans(Duration=3.0,Transition=12)">
<meta http-equiv="cleartype" content="on">
<meta name="msapplication-config" content="IEconfig.xml">
<meta name="application-name" content="Naviance Student">
<meta name="author" content="Naviance">
<meta http-equiv="Content-Security-Policy" content="upgrade-insecure-requests">
<link href="/style-16726.css" rel="stylesheet">
<link rel="preload" href="/main.e6791.js" as="script">
</head>
<body>
<script src="/rewritten_config.js?v=1605811315155"></script>
<div id="root"></div>
<script src="/fc.vendors~main.bb74e.js"></script>
<script src="/main.e6791.js" async></script>
</body>
</html>
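Comparing the two dumps above, the HTML logged from loginFormResponse has an empty <div id="root"></div> and no <form> element at all, while the version from the website has the form nested inside that div. One likely reading is that the form is built by the page's JavaScript after load. Jsoup only parses the HTML the server returns and does not run scripts, so in that case no selector could match. Below is a minimal sketch of that check using only the JDK. The class RawHtmlCheck and the method containsFormTag are hypothetical names, and the two sample strings are trimmed from the dumps above. With Jsoup, the string to test would presumably come from loginFormResponse.body().

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Jsoup): reports whether raw HTML text
// contains an opening <form> tag.
class RawHtmlCheck {
    // "<form" followed by whitespace or ">", matched case-insensitively,
    // so tags such as <formfoo> are not counted.
    private static final Pattern FORM_TAG =
            Pattern.compile("<form[\\s>]", Pattern.CASE_INSENSITIVE);

    static boolean containsFormTag(String html) {
        return FORM_TAG.matcher(html).find();
    }

    public static void main(String[] args) {
        // Trimmed from the HTML logged from loginFormResponse.
        String served = "<body><script src=\"/rewritten_config.js?v=1605811315155\"></script>"
                + "<div id=\"root\"></div>"
                + "<script src=\"/main.e6791.js\" async></script></body>";

        // Trimmed from the HTML copied from the website.
        String rendered = "<div id=\"root\">"
                + "<form class=\"components-NewLogin-style-loginFormWrapper\" "
                + "data-test-id=\"login_form\"></form></div>";

        System.out.println("served has form:   " + containsFormTag(served));   // false
        System.out.println("rendered has form: " + containsFormTag(rendered)); // true
    }
}
```

If the check on the served HTML is false, the selectors themselves are probably not the problem: the element simply is not in the document Jsoup receives.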
Source: https://stackoverflow.com/questions/64947415/jsoup-select-form-returns-null