Problem with my Notebook run #21832 - Training doesn't happen, goes straight to test

mute-alica · October 15, 2024, 12:54pm

Hi! I am Alica! I love the contest and recently got involved. It seems that after data preparation, the CLI doesn't progress to training but jumps directly to the test data! Can you please fix this issue?

enzo · October 15, 2024, 1:41pm

Hello,

I am not sure I understand the issue.

Your code first runs on the 23500 train datasets.
And then it runs on the 9400 test datasets.

It is only after processing the test data that it crashed.

Does it work with crunch.test()?
Would you be comfortable sharing some code here? If not, please send me a message on Discord for more help.

mute-alica · October 15, 2024, 3:03pm

OK, strange, because I use your posted default notebook for the submission step that runs after test processing is complete. Why does it crash after processing the test data? Hmm?

enzo · October 15, 2024, 3:18pm

Did you change anything? The team cannot access your code, so it is hard to help you debug…
Could you make sure you submit the quickstarter with no modification?

mute-alica · October 16, 2024, 1:54am

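# Predict one class id per (dataset, variable) row of the test features.
#y_predicted = model.predict(lgb_test)
y_predicted = model.predict(X_test)
# Attach the predictions to the grouping columns (dataset, variable).
X_y_pred_test = X_group_test
X_y_pred_test["y_predicted"] = y_predicted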

le = LabelEncoder()
le.classes_ = np.array([
    'Cause of X', 'Consequence of X', 'Confounder', 'Collider',
    'Mediator', 'Independent', 'Cause of Y', 'Consequence of Y',
])

X_y_pred_test["label_predicted"] = le.inverse_transform(y_predicted)

submission = create_submission(X_y_pred_test)

return pd.DataFrame(
    submission.items(),
    columns=[
        id_column_name,
        prediction_column_name
    ]
)
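For reference, setting `le.classes_` by hand and then calling `inverse_transform` amounts to indexing into that label list with integer class ids. Below is a minimal plain-Python sketch of the same mapping. The helper name `decode_labels` is hypothetical, and the sketch assumes the model returns integer class ids rather than probabilities.

```python
import operator

# Same label order as the le.classes_ array in the snippet above.
LABELS = [
    "Cause of X", "Consequence of X", "Confounder", "Collider",
    "Mediator", "Independent", "Cause of Y", "Consequence of Y",
]


def decode_labels(class_ids):
    """Map integer class ids to label strings (hypothetical helper)."""
    decoded = []
    for class_id in class_ids:
        # operator.index() accepts Python and NumPy integers but raises
        # TypeError for floats, e.g. when probabilities are passed by mistake.
        index = operator.index(class_id)
        if not 0 <= index < len(LABELS):
            raise ValueError(f"class id out of range: {index}")
        decoded.append(LABELS[index])
    return decoded


print(decode_labels([0, 5, 2]))  # ['Cause of X', 'Independent', 'Confounder']
```

If the decoding step fails on your predictions, it is worth checking whether `model.predict` is returning floats or probabilities instead of class ids.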

mute-alica · October 16, 2024, 1:56am

import pandas as pd
from tqdm import tqdm


def create_submission(X_y_pred_test):
    """
    From the predicted test set, for each dataset, take predicted
    classes of all variables, create the adjacency matrix, then create
    the submission in the requested format.
    """

    submission = {}
    for name, prediction in tqdm(X_y_pred_test.groupby("dataset"), delay=10):
        variables_labels = prediction[["variable", "label_predicted"]].set_index("variable")
        variables = variables_labels.index.tolist()
        variables_all = ["X", "Y"] + variables

        # Rows are parents, columns are children; X -> Y is always present.
        adjacency_matrix = pd.DataFrame(index=variables_all, columns=variables_all)
        adjacency_matrix.index.name = "parent"
        adjacency_matrix[:] = 0
        adjacency_matrix.loc["X", "Y"] = 1

        for v in variables:
            label = variables_labels.loc[v].item()
            if label == "Cause of X":
                adjacency_matrix.loc[v, "X"] = 1
            elif label == "Cause of Y":
                adjacency_matrix.loc[v, "Y"] = 1
            elif label == "Consequence of X":
                adjacency_matrix.loc["X", v] = 1
            elif label == "Consequence of Y":
                adjacency_matrix.loc["Y", v] = 1
            elif label == "Confounder":
                adjacency_matrix.loc[v, "X"] = 1
                adjacency_matrix.loc[v, "Y"] = 1
            elif label == "Collider":
                adjacency_matrix.loc["X", v] = 1
                adjacency_matrix.loc["Y", v] = 1
            elif label == "Mediator":
                adjacency_matrix.loc["X", v] = 1
                adjacency_matrix.loc[v, "Y"] = 1
            elif label == "Independent":
                # The pasted code repeated "Confounder" here, which was
                # unreachable; an independent variable adds no edge.
                pass

        # Flatten the matrix into "<dataset>_<parent>_<child>" -> 0/1 entries.
        for i in variables_all:
            for j in variables_all:
                submission[f'{name}_{i}_{j}'] = int(adjacency_matrix.loc[i, j])

    return submission
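To make the label-to-edge rules above easier to check by hand, here is a small pandas-free sketch of the same mapping for a single dataset. The function names and the dataset name `"00001"` are made up for illustration; this is not the quickstarter's code, only a restatement of its rules under the assumption that every label is one of the eight listed classes.

```python
def edges_for_label(variable, label):
    """Return the (parent, child) edges implied by one predicted label."""
    rules = {
        "Cause of X": [(variable, "X")],
        "Cause of Y": [(variable, "Y")],
        "Consequence of X": [("X", variable)],
        "Consequence of Y": [("Y", variable)],
        "Confounder": [(variable, "X"), (variable, "Y")],
        "Collider": [("X", variable), ("Y", variable)],
        "Mediator": [("X", variable), (variable, "Y")],
        "Independent": [],
    }
    return rules[label]  # KeyError on an unknown label, on purpose


def flat_submission(name, labels_by_variable):
    """Build the flat '<dataset>_<parent>_<child>' -> 0/1 mapping."""
    nodes = ["X", "Y"] + list(labels_by_variable)
    edges = {("X", "Y")}  # the X -> Y edge is always present
    for variable, label in labels_by_variable.items():
        edges.update(edges_for_label(variable, label))
    return {
        f"{name}_{i}_{j}": int((i, j) in edges)
        for i in nodes
        for j in nodes
    }


rows = flat_submission("00001", {"A": "Confounder", "B": "Independent"})
print(rows["00001_A_X"], rows["00001_A_Y"], rows["00001_B_X"], len(rows))
# prints: 1 1 0 16
```

With two variables plus X and Y there are four nodes, so each dataset contributes 4 × 4 = 16 entries.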

mute-alica · October 16, 2024, 1:57am

So if it actually runs the calculations on the test data and that all works, then based on the code above there should be no problem creating the submission file. I haven't changed anything here.

enzo · October 16, 2024, 10:50am

I am sorry, but it is still hard to debug…

Your error was:

There is no trained model to use predict(). Use fit() to train model. Then use this method.

That likely means the model you loaded is empty, i.e. it was never fitted?
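One way this kind of error can arise is when the training step saves a model object that was never fitted, or the inference step builds a fresh model instead of loading the saved one. Below is a minimal sketch of the save-then-load pattern with a guard. `TinyModel`, the `train`/`infer` signatures, the `model_directory_path` argument and the `model.json` file name are all assumptions made for illustration, not the actual quickstarter code.

```python
import json
import os
import tempfile


class TinyModel:
    """Stand-in estimator: predicts the most frequent training label."""

    def __init__(self, most_common=None):
        self.most_common = most_common  # None means "not fitted yet"

    def fit(self, labels):
        self.most_common = max(set(labels), key=labels.count)
        return self

    def predict(self, n_rows):
        if self.most_common is None:
            raise RuntimeError("model is not fitted; call fit() first")
        return [self.most_common] * n_rows

    def save(self, path):
        with open(path, "w") as handle:
            json.dump({"most_common": self.most_common}, handle)

    @classmethod
    def load(cls, path):
        with open(path) as handle:
            return cls(json.load(handle)["most_common"])


def train(labels, model_directory_path):
    # Fit first, then persist; saving before fit() stores an empty model.
    model = TinyModel().fit(labels)
    model.save(os.path.join(model_directory_path, "model.json"))


def infer(n_rows, model_directory_path):
    # Load the persisted model instead of constructing a new, unfitted one.
    model = TinyModel.load(os.path.join(model_directory_path, "model.json"))
    return model.predict(n_rows)


with tempfile.TemporaryDirectory() as model_dir:
    train([1, 1, 0], model_dir)
    print(infer(2, model_dir))  # [1, 1]
```

If the saved file comes from an unfitted model, `infer` fails only at `predict()`, which would match a crash that shows up after the test data has been processed.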
