Patched Phase Diagram

Source: DataCite Commons. Updated: 2025-11-01. Indexed: 2026-02-09.
Download link:
https://figshare.com/articles/dataset/Patched_Phase_Diagram/30508715/3
Description:
- `ppd-mp_all_entries_uncorrected_250409.pkl.gz` is a `PatchedPhaseDiagram` object created from all entries in the Materials Project database with uncorrected energies. The object is saved in a gzipped pickle format.

```python
import os
import gzip
import pickle
import dotenv

from pymatgen.ext.matproj import MPRester
from pymatgen.analysis.phase_diagram import PatchedPhaseDiagram, PDEntry

dotenv.load_dotenv()
mp_api_key = os.getenv("MP_API_KEY")

# Collect all entries from the Materials Project database
all_entries = MPRester(mp_api_key).get_entries("", compatible_only=True)
print(f"Found {len(all_entries)} entries in the Materials Project database")

# Keep only entries whose run type is GGA or GGA_U
all_entries = [e for e in all_entries if e.data["run_type"] in ["GGA", "GGA_U"]]
print(f"Found {len(all_entries)} entries with GGA or GGA_U run type")

# Rebuild each entry as a PDEntry that carries the uncorrected energy
all_entries_uncorrected = [
    PDEntry(composition=e.composition, energy=e.uncorrected_energy)
    for e in all_entries
]
print(f"Found {len(all_entries_uncorrected)} entries with uncorrected energies")

# Create PatchedPhaseDiagram
ppd_mp = PatchedPhaseDiagram(all_entries_uncorrected, verbose=True)  # type: ignore
print(f"PatchedPhaseDiagram created with {len(ppd_mp.all_entries)} entries")

# Save the PatchedPhaseDiagram object
with gzip.open(
    "assets/ppd-mp_all_entries_uncorrected_250409.pkl.gz", "wb"
) as f:
    pickle.dump(ppd_mp, f)
print(
    "PatchedPhaseDiagram object saved as assets/ppd-mp_all_entries_uncorrected_250409.pkl.gz"
)
```

- Usage:

```python
import gzip
import pickle

from pymatgen.analysis.phase_diagram import PatchedPhaseDiagram

with gzip.open(
    "assets/ppd-mp_all_entries_uncorrected_250409.pkl.gz", "rb"
) as f:
    ppd_mp = pickle.load(f)

# new_entry is a user-supplied PDEntry for the material being evaluated
e_above_hull = ppd_mp.get_e_above_hull(new_entry, allow_negative=True)
```

## 1.2. All unique structures from the Materials Project database

- `mp_all_unique_structure_250416.json.gz` is a list of unique structures from the Materials Project database. The structures are saved in a gzipped JSON format. (148854 unique structures)

```python
import os
import dotenv

from monty.serialization import loadfn, dumpfn
from pymatgen.analysis.structure_matcher import StructureMatcher
from pymatgen.ext.matproj import MPRester

dotenv.load_dotenv()
mp_api_key = os.getenv("MP_API_KEY")
assert mp_api_key is not None

with MPRester(mp_api_key) as mpr:
    docs = mpr.materials.search(material_ids={}, fields=["structure"])

st_list = [doc.structure for doc in docs]
print(f"Found {len(st_list)} structures")

# Group equivalent structures and keep one representative per group
sm = StructureMatcher()
output = sm.group_structures(st_list)
unique_st_list = [o[0] for o in output]
print(f"Found {len(unique_st_list)} unique structures")

# Save
dumpfn(unique_st_list, "assets/mp_all_unique_structure_250416.json.gz")
```

## 1.3. 1000 randomly sampled structures from the mp-20 test set for the CSP task

- `csp_test_sampled_1000_compositions.txt` is a list of 1000 randomly sampled compositions from the mp-20 test set. The compositions are saved in a plain text format, one per line, for evaluation. (1000 unique compositions)

```python
import pandas as pd
from pymatgen.core import Structure

df_test = pd.read_csv("../data/mp-20/test.csv", index_col=0)
print(len(df_test))

# Remove duplicate compositions
df_test = df_test.drop_duplicates(subset=["composition"])
print(len(df_test))

# Rename the original composition column to raw_composition
df_test = df_test.rename(columns={"composition": "raw_composition"})

# Select 1000 random samples
n_sample = 1000
df_test_sampled = df_test.sample(n_sample, random_state=0)
df_test_sampled = df_test_sampled.reset_index(drop=True)

# Get compositions for CSP
st_list = [Structure.from_str(s, fmt="cif") for s in df_test_sampled["cif"]]
df_test_sampled["num_atoms"] = [len(st) for st in st_list]
df_test_sampled["composition"] = [
    str(st.composition).replace(" ", "") for st in st_list
]
print(len(df_test_sampled))

# Save to txt
with open("assets/csp_test_sampled_1000_compositions.txt", "w") as f:
    for c in df_test_sampled["composition"]:
        f.write(c + "\n")
```
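The composition file written in section 1.3 holds one space-free formula per line (for example `Li2O1`). As a minimal sketch of reading such a file back, the block below uses only the standard library. The helper names `read_compositions` and `parse_composition`, the regular expression, and the three demo formulas are assumptions made for illustration and are not part of the dataset; in practice pymatgen's `Composition` class could parse the same strings.

```python
import re
import tempfile
from pathlib import Path

# An element symbol followed by an optional integer or decimal amount.
_TOKEN = re.compile(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?")


def parse_composition(formula):
    """Parse a formula such as 'Li2O1' into element counts (hypothetical helper)."""
    counts = {}
    for symbol, amount in _TOKEN.findall(formula):
        counts[symbol] = counts.get(symbol, 0.0) + float(amount or 1)
    return counts


def read_compositions(path):
    """Read one composition per line, skipping blank lines (hypothetical helper)."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


# Demo on a small stand-in file; the real file would be
# assets/csp_test_sampled_1000_compositions.txt.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo_compositions.txt"
    demo_path.write_text("Li2O1\nNa1Cl1\nFe2O3\n")
    compositions = read_compositions(demo_path)

parsed = [parse_composition(c) for c in compositions]
num_atoms = [sum(p.values()) for p in parsed]
print(list(zip(compositions, num_atoms)))
```

The per-formula atom totals computed here play the same role as the `num_atoms` column in the sampling script, which can be a useful sanity check when feeding the compositions to a CSP model.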
Provider:
figshare
Created:
2025-11-01