Problem with my notebook run #21832: training doesn't happen, it goes straight to test

Hi! I am Alica! I love the contest and got involved recently. It seems that after data preparation the CLI doesn't progress to training but jumps directly to the test data! Can you please fix this issue?

Hello,

I am not sure I understand what the issue is.

Your code first runs on the 23500 train datasets.
And then it runs on the 9400 test datasets.

The crash only happened after the test data had been processed.

Is it working with crunch.test()?
Would you be comfortable sharing some code here? If not, please send me a message on Discord for more help.

OK, strange, because I use the code from your posted default notebook to build the submission once test processing is complete. Why would it crash after processing the test data? Hmm?

Did you change anything? The team cannot access your code, so it is hard to help you debug…
Could you make sure you submit the quickstarter with no modifications?

#y_predicted = model.predict(lgb_test)
y_predicted = model.predict(X_test)
X_y_pred_test = X_group_test
X_y_pred_test["y_predicted"] = y_predicted

le = LabelEncoder()
le.classes_ = np.array([
    'Cause of X', 'Consequence of X', 'Confounder', 'Collider',
    'Mediator', 'Independent', 'Cause of Y', 'Consequence of Y',
])

X_y_pred_test["label_predicted"] = le.inverse_transform(y_predicted)

submission = create_submission(X_y_pred_test)

return pd.DataFrame(
    submission.items(),
    columns=[
        id_column_name,
        prediction_column_name
    ]
)

def create_submission(X_y_pred_test):
    """
    From the predicted test set, for each dataset, take predicted
    classes of all variables, create the adjacency matrix, then create
    the submission in the requested format.
    """

    submission = {}
    for name, prediction in tqdm(X_y_pred_test.groupby("dataset"), delay=10):
        variables_labels = prediction[["variable", "label_predicted"]].set_index("variable")
        variables = variables_labels.index.tolist()
        variables_all = ["X", "Y"] + variables

        # Rows are parents, columns are children; X -> Y is always present.
        adjacency_matrix = pd.DataFrame(index=variables_all, columns=variables_all)
        adjacency_matrix.index.name = "parent"
        adjacency_matrix[:] = 0
        adjacency_matrix.loc["X", "Y"] = 1

        for v in variables:
            l = variables_labels.loc[v].item()
            if l == "Cause of X":
                adjacency_matrix.loc[v, "X"] = 1
            elif l == "Cause of Y":
                adjacency_matrix.loc[v, "Y"] = 1
            elif l == "Consequence of X":
                adjacency_matrix.loc["X", v] = 1
            elif l == "Consequence of Y":
                adjacency_matrix.loc["Y", v] = 1
            elif l == "Confounder":
                adjacency_matrix.loc[v, "X"] = 1
                adjacency_matrix.loc[v, "Y"] = 1
            elif l == "Collider":
                adjacency_matrix.loc["X", v] = 1
                adjacency_matrix.loc["Y", v] = 1
            elif l == "Mediator":
                adjacency_matrix.loc["X", v] = 1
                adjacency_matrix.loc[v, "Y"] = 1
            elif l == "Independent":
                # The pasted code repeated "Confounder" here, which was
                # unreachable; an independent variable adds no edges.
                pass

        # Flatten the matrix into "<dataset>_<parent>_<child>" -> 0/1 entries.
        for i in variables_all:
            for j in variables_all:
                submission[f'{name}_{i}_{j}'] = int(adjacency_matrix.loc[i, j])

    return submission

So if it actually runs the calculations on the test data and that all works, then based on the code above there should be no problem creating the submission file. I haven't changed anything here.
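One way to check the submission step on its own, without the trained model, is to replay the label-to-edge mapping on a tiny hand-made input. The following is only a dependency-free sketch of the same mapping, not the quickstarter's code: `build_submission`, `LABEL_TO_EDGES`, `SELF`, and the dataset name "00000" are hypothetical names chosen for illustration.

```python
# Placeholder meaning "the variable being labelled" (hypothetical helper).
SELF = object()

# Edges (parent, child) implied by each predicted label.
LABEL_TO_EDGES = {
    "Cause of X": [(SELF, "X")],
    "Cause of Y": [(SELF, "Y")],
    "Consequence of X": [("X", SELF)],
    "Consequence of Y": [("Y", SELF)],
    "Confounder": [(SELF, "X"), (SELF, "Y")],
    "Collider": [("X", SELF), ("Y", SELF)],
    "Mediator": [("X", SELF), (SELF, "Y")],
    "Independent": [],
}


def build_submission(name, labels):
    """Map {variable: label} for one dataset to '<name>_<parent>_<child>' -> 0/1."""
    nodes = ["X", "Y"] + list(labels)
    edges = {("X", "Y")}  # X -> Y is always present
    for variable, label in labels.items():
        for parent, child in LABEL_TO_EDGES[label]:
            edges.add((
                variable if parent is SELF else parent,
                variable if child is SELF else child,
            ))
    return {f"{name}_{i}_{j}": int((i, j) in edges) for i in nodes for j in nodes}


# Toy dataset with two extra variables (made-up labels).
toy = build_submission("00000", {"A": "Confounder", "B": "Mediator"})
print(len(toy))          # 4 nodes -> 16 entries
print(toy["00000_A_X"])  # A -> X exists for a confounder
print(toy["00000_B_X"])  # no B -> X edge for a mediator
```

If a check like this passes, the submission step is probably fine and the failure is more likely to come from an earlier step, such as the `model.predict` call.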

I am sorry, but it is still hard to debug…

Your error was:

There is no trained model to use predict(). Use fit() to train model. Then use this method.

This likely means that the model you loaded is empty?
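If that is the case, one way to catch it earlier is to refuse to save or use a model that has never been fitted. The sketch below only illustrates that guard: `TinyModel`, `save_model`, and `load_model` are hypothetical stand-ins, and it assumes the real estimator exposes some way to tell whether it has been fitted. The actual model class and the way the notebook stores it are not shown in this thread.

```python
import json
import os
import tempfile


class TinyModel:
    """Hypothetical stand-in for a real estimator; it predicts the training mean."""

    def __init__(self, mean=None):
        self.mean_ = mean

    def fit(self, y):
        self.mean_ = sum(y) / len(y)
        return self

    def is_fitted(self):
        return self.mean_ is not None

    def predict(self, n):
        if not self.is_fitted():
            raise RuntimeError("There is no trained model to use predict().")
        return [self.mean_] * n


def save_model(model, path):
    # Fail during training rather than later, at inference time.
    if not model.is_fitted():
        raise ValueError("refusing to save an unfitted model")
    with open(path, "w") as f:
        json.dump({"mean": model.mean_}, f)


def load_model(path):
    with open(path) as f:
        model = TinyModel(mean=json.load(f)["mean"])
    if not model.is_fitted():
        raise ValueError(f"model loaded from {path} is empty")
    return model


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "model.json")

    # Saving an unfitted model is rejected up front.
    try:
        save_model(TinyModel(), path)
        unfitted_rejected = False
    except ValueError:
        unfitted_rejected = True

    # A fitted model round-trips and can predict.
    save_model(TinyModel().fit([1, 2, 3]), path)
    predictions = load_model(path).predict(3)

print(unfitted_rejected, predictions)
```

With a guard like this in the training step, a model that was never fitted would fail at save time with a clear message instead of surfacing later, after the test data has been processed.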