A Python question.
Body
Hello.
I have nowhere else to ask, so I am posting my question here.
I can extract all the other text, but I cannot figure out how to get the value of data-index=.
Thank you.
HTML source
<ul class="flex flex-wrap">
<li class="w-full Episode_episodeItem__Hjwb9 relative mb-1 !w-[calc((100%-2px)/3)] lg:!w-[calc((100%-4px)/5)]" data-index="0">
<img src="https://aaa/33583d6f-90ff-4708-a28c-57d299db7b58.jpg">
</li>
<li class="w-full Episode_episodeItem__Hjwb9 relative mb-1 !w-[calc((100%-2px)/3)] lg:!w-[calc((100%-4px)/5)]" data-index="1">
<img src="https://aaa/33583d6f-90ff-4708-a28c-57d299db7b59.jpg">
</li>
</ul>
Python source
for c in soup.select('li[class="w-full Episode_episodeItem__Hjwb9 relative mb-1 !w-[calc((100%-2px)/3)] lg:!w-[calc((100%-4px)/5)]"]'):
    episode_image = c.select_one('img')['src']
    episode_no = c.select_one('data-index').text  # this is the line that does not work
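Answers 2
Take a look at this for reference. data-index is an attribute of the li tag, not a child element, so select_one('data-index') finds nothing; read it from the tag itself with c['data-index'].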
from bs4 import BeautifulSoup
html_doc = '''
<ul class="flex flex-wrap">
<li class="w-full Episode_episodeItem__Hjwb9 relative mb-1 !w-[calc((100%-2px)/3)] lg:!w-[calc((100%-4px)/5)]" data-index="0">
<img src="https://aaa/33583d6f-90ff-4708-a28c-57d299db7b58.jpg">
</li>
<li class="w-full Episode_episodeItem__Hjwb9 relative mb-1 !w-[calc((100%-2px)/3)] lg:!w-[calc((100%-4px)/5)]" data-index="1">
<img src="https://aaa/33583d6f-90ff-4708-a28c-57d299db7b59.jpg">
</li>
</ul>
'''
soup = BeautifulSoup(html_doc, 'html.parser')
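# Select only the <li> tags that carry a data-index attribute
for c in soup.select('li[data-index]'):
    episode_image = c.select_one('img')['src']
    # Attributes of a tag are read like dictionary keys
    episode_no = c['data-index']
    print(episode_image, episode_no)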
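If some items on the real page might be missing data-index or the img tag, a slightly more defensive variant could look like the sketch below. It is only an illustration under a few assumptions: the third, empty li is a made-up item added to show the skip logic, the class attribute is shortened for readability, and the names episodes, 'no' and 'image' are hypothetical. It matches on the Episode_episodeItem part of the class instead of the full class string, and uses .get(), which returns None rather than raising KeyError when an attribute is absent.

```python
from bs4 import BeautifulSoup

# Shortened copy of the HTML from the question.
# The last <li> is a hypothetical item with no data-index and no <img>.
html_doc = '''
<ul class="flex flex-wrap">
  <li class="w-full Episode_episodeItem__Hjwb9 relative mb-1" data-index="0">
    <img src="https://aaa/33583d6f-90ff-4708-a28c-57d299db7b58.jpg">
  </li>
  <li class="w-full Episode_episodeItem__Hjwb9 relative mb-1" data-index="1">
    <img src="https://aaa/33583d6f-90ff-4708-a28c-57d299db7b59.jpg">
  </li>
  <li class="w-full Episode_episodeItem__Hjwb9 relative mb-1">
  </li>
</ul>
'''

soup = BeautifulSoup(html_doc, 'html.parser')

episodes = []
# [class*="..."] matches any <li> whose class attribute contains this substring
for li in soup.select('li[class*="Episode_episodeItem"]'):
    index = li.get('data-index')   # None if the attribute is missing
    img = li.select_one('img')     # None if there is no <img> inside
    if index is None or img is None:
        continue                   # skip incomplete items
    episodes.append({'no': int(index), 'image': img.get('src')})

print(episodes)
```

Attribute values come back as strings, so int() is used here on the assumption that data-index is always numeric; drop the conversion if that does not hold on your page.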